University, University of British Columbia, University College London and 
University of California, Davis 

GRavitational lEnsing Accuracy Testing 2010 (GREAT10) is 
a public image analysis challenge aimed at the development of al- 
gorithms to analyze astronomical images. Specifically, the challenge 
is to measure varying image distortions in the presence of a variable 
convolution kernel, pixelization and noise. This is the second in a se- 
ries of challenges set to the astronomy, computer science and statistics 
communities, providing a structured environment in which methods 
can be improved and tested in preparation for planned astronomical 
surveys. GREAT10 builds upon previous work by introducing variable 
fields into the challenge. The "Galaxy Challenge" involves the 
precise measurement of galaxy shape distortions, quantified locally 
by two parameters called shear, in the presence of a known con- 
volution kernel. Crucially, the convolution kernel and the simulated 
gravitational lensing shape distortion both now vary as a function of 
position within the images, as is the case for real data. In addition, 
we introduce the "Star Challenge" that concerns the reconstruction 
of a variable convolution kernel, similar to that in a typical astronom- 
ical observation. This document details the GREAT10 Challenge for 
potential participants. Continually updated information is also avail- 
able from www.greatchallenges.info. 

1. Introduction. The GRavitational lEnsing Accuracy Testing (GREAT) 
challenges are a series of simulations that provide an environment within 
which image analysis algorithms of importance for gravitational lensing 
cosmology can be developed. The central theme of GREAT10 is variability: 
the simulations contain spatially variable quantities, and the challenge is to 
reconstruct the properties of these variable fields to a high accuracy. 

Gravitational lensing is the effect that light rays are deflected by gravity. 
Every galaxy image is distorted by this effect because of mass that is always 
present between the galaxy and the observer. For the majority of galaxies 
this distortion causes a small additional ellipticity called shear. Measuring 
the shear allows us to extract information on the nature of the interven- 
ing lensing mass and the expansion of the Universe [see Bartelmann and 
Schneider (2001) for a technical review]. In particular, shear can be used 
to illuminate the nature of dark matter, dark energy and possible deviations 
from general relativity. However, to extract this information from 
gravitational lensing data, the shear needs to be determined to a high degree 
of accuracy. 

The GREAT challenges are designed to aid the development of algorithms 
that aim to measure the gravitational lensing shear, by evolving 
lensing image simulations in a controlled manner. The first of the GREAT 
challenges, GREAT08 [Bridle et al. (2009)], began with the zeroth-order 
problem of measuring a spatially constant shear in the presence of a spatially 
constant convolution kernel (or Point Spread Function, PSF, impulse re- 
sponse). GREAT08 was inspired by the Shear TEsting Programme [STEP1, 
Heymans et al. (2006); STEP2, Massey et al. (2007)], which was a suite 
of simulations created by the lensing community and analyzed internally. 
GREAT08 was set as a PASCAL$^{9}$ challenge to both the astronomy and 
computer science communities, to encourage the development of interdisci- 
plinary approaches to this image processing problem, and was successful on 
various levels [Bridle et al. (2009)]. A high accuracy was achieved under the 
majority of simulated conditions, and there was participation from outside 
the gravitational lensing and cosmology communities; indeed, the winner 
[Hosseini and Bethge (2009)] was not a cosmologist. 

In GREAT10 both the shear and the PSF are spatially varying quantities. 
The shear varies naturally across astronomical images because of the large 
scale distribution of matter in the Universe. The PSF varies spatially in 
images because of atmospheric effects and telescope optics. The primary aim 
for participants in GREAT10 will be to reconstruct the correlation function, 
or power spectrum, of the shear variation in the presence of a known but 
varying convolution kernel. 

GREAT10 is a PASCAL challenge, set by cosmologists to the astronomy, 
computer science and statistics communities. The challenge will be launched 
in late 2010 and will run for 9 months. Algorithms that are successful when 
applied to these simulations will help cosmologists to exploit the scientific 
potential of current and future imaging surveys. These surveys will generate 
petabytes of imaging information ideal for gravitational lensing, and will 
necessitate automated data analysis. 

Fig. 1. Top: A summary of the simulation and challenge details of GREAT10, which 
will consist of two separate but interconnected challenges, the main "Galaxy Challenge" 
and the "Star Challenge." 

In this article we will introduce the GREAT10 simulations in Section 2 
and go into some detail in Section 3. We will also outline the submission 
process, and procedures through which we will evaluate results in Section 4. 
In Section 5 we conclude by discussing the scope and context of the sim- 
ulations in relation to real data. We include a number of Appendices that 
contain technical details. We summarize the simulation and challenge details 
of GREAT10 in Figure 1. 

2. The simulations. In GREAT10 we will introduce variable fields into 
the shear estimation challenge. Both the shear and the PSF will vary across 
the images. In Figure 2 we show how GREAT10 is related to the process 
through which we go from images of galaxies and stars to cosmological pa- 
rameter estimation. There are a number of steps in this process that we will 
discuss in detail in this section. 

To preface this section, we summarize the GREAT10 image simulation 
steps that represent each step of the lensing process: 

• Undistorted Image: We start with an unsheared set of galaxies (with a 
distribution of ellipticities described in Appendix A) or point sources (stars). 

• Shear Field Applied: The galaxy images, but not the star images, are 
transformed using a local distortion matrix that varies spatially across 
the image, a "shear" field. 

• Convolution: Both the sheared galaxy images and star images are locally 
convolved by a Point Spread Function that varies spatially across the 
image. 

• Pixelization: The convolved star and galaxy images are pixelized by 
summing the intensity in square pixels. 

• Noise: Uncorrelated Gaussian distributed (homoskedastic) noise is added 
to all images. The image simulation process will also create Poisson noise. 

Fig. 2. GREAT10 introduces two parallel challenges, both based on steps in the analysis 
of real astronomical data. The "Star Challenge" concerns the shapes of scattered stars. 
Each star is a local, noisy realization of the convolution kernel that is inevitably applied to 
any astronomical image (due to a combination of the Earth's atmosphere, telescope optics 
and detector imperfections). This kernel varies smoothly across an image, and the challenge 
is to interpolate measurements between the locations of stars. The "Galaxy Challenge" 
concerns galaxy shapes, which must be measured after effectively deconvolving the image 
from a (supplied) model of the convolution kernel. This builds most closely on the previous 
GREAT08 challenge, but includes the crucial development that both the kernel and the 
distortion applied to the galaxy shapes now vary across an image. The required output 
is a measurement of the correlation between the shapes of pairs of galaxies separated by 
various distances. In a real cosmological analysis, the two procedures are applied in turn, 
then the correlations are compared to predictions of detailed simulations of the Universe. 

Further details are given in Figures 1, 2 and 8. In this section we will describe 
the cosmologically important shear field and the type of convolution kernel 
that images experience. 
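
As an illustration, the simulation steps summarized above can be sketched for a single postage stamp. This is a minimal sketch under stated assumptions, not the GREAT10 simulation code: the galaxy and the PSF are both circular Gaussians, the shear and PSF are constant across the stamp, only Gaussian noise is added, and the names `simulate_stamp` and `quadrupole_moments` and all default parameter values are hypothetical.

```python
import numpy as np


def simulate_stamp(g1, g2, gal_sigma=3.0, psf_sigma=2.0, n_pix=32,
                   oversample=4, noise_sigma=0.0, rng=None):
    """Toy forward model: shear -> convolve -> pixelize -> add noise."""
    n_fine = n_pix * oversample
    # Fine-grid coordinates in units of final pixels, centred on the stamp.
    c = (np.arange(n_fine) - (n_fine - 1) / 2.0) / oversample
    x_obs, y_obs = np.meshgrid(c, c)

    # Undistorted circular Gaussian galaxy, sheared by mapping observed
    # coordinates back to the unlensed frame with the local distortion matrix.
    x_orig = (1.0 - g1) * x_obs - g2 * y_obs
    y_orig = -g2 * x_obs + (1.0 + g1) * y_obs
    sheared = np.exp(-0.5 * (x_orig**2 + y_orig**2) / gal_sigma**2)

    # Convolution with a Gaussian PSF, done as a circular convolution via FFTs.
    k = (np.arange(n_fine) - n_fine // 2) / oversample
    kx, ky = np.meshgrid(k, k)
    psf = np.exp(-0.5 * (kx**2 + ky**2) / psf_sigma**2)
    psf /= psf.sum()
    convolved = np.fft.ifft2(np.fft.fft2(sheared)
                             * np.fft.fft2(np.fft.ifftshift(psf))).real

    # Pixelization: sum the intensity falling inside each square pixel.
    image = convolved.reshape(n_pix, oversample,
                              n_pix, oversample).sum(axis=(1, 3))

    # Uncorrelated Gaussian noise with the same variance in every pixel.
    if noise_sigma > 0.0:
        rng = np.random.default_rng() if rng is None else rng
        image = image + rng.normal(0.0, noise_sigma, image.shape)
    return image


def quadrupole_moments(image):
    """Unweighted second moments (Qxx, Qyy) about the stamp centre."""
    n = image.shape[0]
    p = np.arange(n) - (n - 1) / 2.0
    x, y = np.meshgrid(p, p)
    total = image.sum()
    return (image * x**2).sum() / total, (image * y**2).sum() / total
```

In this toy model a positive first shear component stretches the noiseless image along the x axis, so `quadrupole_moments(simulate_stamp(0.05, 0.0))` returns a larger moment along x than along y, while an unsheared stamp gives equal moments.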

2.1. Gravitational shear. As a photon propagates from a galaxy to an 
observer, the path that the light takes is distorted, or lensed, by the presence 
of mass along the line of sight. The first-order effect that this gravitational 
lensing has on the image of a galaxy is to introduce a local distortion that 
can be expressed as a remapping of the unlensed, original pixels 

(1) $$\begin{pmatrix} x_{\rm original} \\ y_{\rm original} \end{pmatrix} = \left[ \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} - |g| \begin{pmatrix} \cos(2\phi) & \sin(2\phi) \\ \sin(2\phi) & -\cos(2\phi) \end{pmatrix} \right] \begin{pmatrix} x_{\rm observed} \\ y_{\rm observed} \end{pmatrix},$$

where $x$ and $y$ denote a coordinate system that describes the observed light 
distribution of an object.$^{10}$ The second term in this expression is known as 
the shear, and has the effect of stretching an object in a direction 
parameterized by an angle $\phi$ by an amount $|g|$. 

It is important to note that the factor of 2 in angle means that the shear is 
rotationally symmetric under rotations of 180 degrees. In general, we define 
the shear as a local complex variable 

(2) $$g = |g| e^{2i\phi} = |g|[\cos(2\phi) + i\sin(2\phi)] = g_1 + i g_2,$$

where $g_1$ represents local "+ type" distortions (along the Cartesian axes) 
and $g_2$ represents local "× type" distortions (along the 45 degree axes). 
Equation (1) can now be rewritten in terms of $g_1$ and $g_2$ as 

(3) $$\begin{pmatrix} x_{\rm original} \\ y_{\rm original} \end{pmatrix} = \begin{pmatrix} 1 - g_1 & -g_2 \\ -g_2 & 1 + g_1 \end{pmatrix} \begin{pmatrix} x_{\rm observed} \\ y_{\rm observed} \end{pmatrix}.$$



10 We highlight a caveat that, in general, gravitational lensing also introduces a term 
that alters the observed size of an object. This term modifies equation (1) by introducing 
a multiplication factor $(1 - \kappa)$, where $\kappa$ is called the "convergence." In GREAT10 we 
explicitly set $\kappa = 0$ in all images. 

The amplitude and direction of the shear $g$ that an object experiences 
depend on the amount and nature of the lensing matter, rendering this 
quantity of great interest to the cosmological community. 
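
The relation between the polar form of the shear and the distortion matrix can be checked numerically. The following is a minimal sketch; the function names `complex_shear` and `distortion_matrix` are hypothetical and chosen only for illustration.

```python
import numpy as np


def complex_shear(g_abs, phi):
    """Complex shear g = |g| exp(2 i phi) = g1 + i g2."""
    return g_abs * np.exp(2j * phi)


def distortion_matrix(g):
    """Matrix mapping observed to original coordinates for complex shear g."""
    g1, g2 = g.real, g.imag
    return np.array([[1.0 - g1, -g2],
                     [-g2, 1.0 + g1]])
```

Because of the factor of 2 in the phase, `complex_shear(0.05, phi)` and `complex_shear(0.05, phi + np.pi)` return the same value, which is the 180 degree symmetry of the shear.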

The shear effect, to first order, produces an additional ellipticity in an 
object image; here we define the ellipticity of an object as 
$e = (1 - r)(1 + r)^{-1} \exp(2i\psi)$, where $r$ is the ratio of the major to minor 
axes in the image, and $\psi$ is an angle of orientation. A significant 
characteristic of gravitational lensing is that galaxies in general, and in 
GREAT10, already have an "intrinsic" presheared ellipticity $e^{\rm intrinsic}$ such 
that the measured ellipticity per galaxy can be expressed as$^{11}$ 

(4) $$e^{\rm observed} = \frac{e^{\rm intrinsic} + g}{1 + g^{*} e^{\rm intrinsic}} \approx e^{\rm intrinsic} + g,$$

where the approximation is true for small shear $|g| \ll 1$; in GREAT08 and 
GREAT10 the shear has values of $|g| < 0.05$. One aspect of the challenge 
for gravitational lensing is that we cannot directly observe the unsheared 
images of the galaxies that we use in the analyses. 
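
The relation between intrinsic and observed ellipticity in equation (4) can be explored with complex arithmetic. This is a minimal sketch with hypothetical values: the intrinsic ellipticity modulus of 0.3, the shear value and the name `observed_ellipticity` are assumptions made for illustration only.

```python
import numpy as np


def observed_ellipticity(e_intrinsic, g):
    """Observed complex ellipticity for intrinsic ellipticity and shear g."""
    return (e_intrinsic + g) / (1.0 + np.conj(g) * e_intrinsic)


# Sixteen galaxies with the same |e| and evenly spread orientation angles psi.
g = 0.02 - 0.03j
angles = np.arange(16) * np.pi / 16
e_int = 0.3 * np.exp(2j * angles)
e_obs = observed_ellipticity(e_int, g)

# The individual intrinsic shapes are unknown, but their orientations average
# out, so the mean observed ellipticity is close to the applied shear.
mean_e_obs = e_obs.mean()
```

In this toy example no single galaxy reveals the shear, since each observed ellipticity is dominated by the intrinsic shape, whereas the ensemble mean `mean_e_obs` is very close to `g`.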

2.2. Variable shear fields. The shear we observe varies spatially as a func- 
tion of position on the sky. This variation reflects the distribution of mass 
in the Universe, which forms a "cosmic web" of structures within which 
galaxies cluster on all scales through the influence of gravity. As we observe 
galaxies, at different positions and at different distances, through this cosmic 
web the light from each galaxy is deflected by a different amount by a differ- 
ent distribution of mass along the line of sight. The effect is that the shear 
induced on galaxy images is not constant across the sky but instead varies 
in a way that reflects the large scale, nonuniform, distribution of mass in 
the Universe. This lensing by large scale structure is known as cosmic shear. 
The measurement of cosmic shear has been one of the major goals of gravi- 
tational lensing and promises to become one of the most sensitive methods 
with which we will learn about dark energy and dark matter. 

The shear field we observe is in fact a spin-2 field. This means that it 
is a scalar field described locally by $g_1$ and $g_2$ components that introduce 
180 degree invariant distortions [the "2" in "spin-2" refers to the factor of 
2 in (1) and (2)]. In this case equation (2) acts as a local coordinate 
transformation for each galaxy where $g_1(\theta_x, \theta_y)$ and $g_2(\theta_x, \theta_y)$ are now variable 
fields as a function of position$^{12}$ $(\theta_x, \theta_y)$. 



11 Here we use $|e| = (1 - r)(1 + r)^{-1}$ for the ellipticity; alternatively, one can use 
$|e| = (1 - r^2)(1 + r^2)^{-1}$, which leads to an extra factor of 2 in (4); see Bartelmann and 
Schneider (2001), Section 4.2, for a detailed discussion. 

12 In GREAT10 we will assume that the shear is constant on scales of the galaxy images 
themselves, such that the lensing effect is a local coordinate transform as in (1). However, 
we note that a second-order weak lensing effect called flexion [Goldberg and Bacon (2005)] 
is present in real data and in very high mass regions the lensing can even produce arcs 
and multiply imaged galaxies (called strong lensing). 

T. KITCHING ET AL. 

A general spin-2 field can be written as a sum of an E-mode component 
(curl-free) or gradient term, and a B-mode component (grad-free) or curl 
term; E and B are used in analogy to the electric and magnetic components of 
an electromagnetic field. The cosmological effect on the shear field is to 
introduce an E-mode signal and only negligible B-mode, which means that a 
tangential shear is induced around any region of excess density (see Figure 4). 

In GREAT10 we will evaluate methods on their ability to reconstruct the 
E-mode variation only. In real observations the unsheared orientations of 
galaxies will in general produce an E- and B-mode in the variation of the 
observed ellipticities [equation (4)]. However, in GREAT10 the unsheared 
population of galaxies will have a pure B-mode ellipticity distribution. This 
B-mode is introduced to reduce "ellipticity noise terms" in the simulations, 
and means that the size of the simulations is significantly reduced; this 
B-mode will not contribute to the E-mode variation with which we will evaluate 
the results. The intrinsic (undistorted) shape of a galaxy is not observable; 
galaxies always have the additional lensing distortion. The uncertainty on 
our knowledge of the intrinsic ellipticity of a galaxy, which is commonly 
related to the intrinsic variance in the ellipticities of the population, is what 
is referred to as "ellipticity noise." We present a technical discussion of this 
aspect of the simulations in Appendix A. 

2.3. Measuring a variable shear field. The galaxies we observe act as 
discrete points at which the shear field is sampled. The amount of shear 
induced on the observed image of an individual object is small, typically 
<3% (a change in the major-to-minor axis ratio of ~1.06), and so techniques 
have been developed to measure statistical features of the shear distribution 
from an ensemble of galaxies. The mean of the shear field $\langle g_i \rangle$ is zero when 
averaged over sufficiently large scales; however, the variance of the shear field 
is nonzero and contains cosmological information. We wish to calculate the 
variance of the shear field as a function of scale (small scale means in close 
proximity in an image); a large variance on small scales, for example, would 
signify a matter distribution with structures on those scales (see Figure 3). 

Fig. 3. Real lensing effect on an ensemble of galaxy images. The shear distortion 
$g = g_1 + i g_2$ measured from the shapes of galaxies is represented in the left panel as a tick 
mark in the direction of $g$, with a length proportional to its magnitude. The amount of 
shear in a particular region of the image is determined by the amount of lensing material 
(dark matter) along the line of sight from the galaxy. Hence, we can reconstruct a map 
of the mass shown in the middle panel. We can also take the two-point correlation 
function of the shear field which is shown in the right panel. The left and middle panels show 
data from the Hubble COSMOS survey [Massey et al. (2007)] and the right panel shows 
a correlation function from the CFHT Legacy Survey [Fu et al. (2008)]. 

Fig. 4. Schematic effect on an ensemble of galaxy images. The shear distortion varies 
as a function of position because the light propagates through the variable, foreground 
large scale structure (LSS) or "cosmic web." The properties of the LSS imprint specific 
signatures on the type of distortion pattern. For example, if we had a sample of circular 
background galaxies (clear circles, left panels), they would be distorted into ellipses with 
major axes orientated tangent to density contours by a foreground mass overdensity, or 
radial shapes by a mass underdensity. These typically contribute to the cosmic shear on 
different physical scales, and their combined effect is an "E-mode" correlation function that 
varies as a function of scale. However, the LSS does not introduce significant "B-mode" 
or curl distortion patterns (represented by a cross X in this figure). 

We can calculate the variance of the shear field as a function of scale 
by computing the two-point correlation function $\xi(\Delta\theta)$ of the shear field. 
The correlation function measures the tendency for galaxies at a chosen 
separation to have preferred shape alignment. For any given pair of galaxies 
we can define a component of the shear from each in terms of a tangential 
shear component, which is perpendicular to the line joining the pair, and 
a cross-component which is at 45 degrees; this is meant to isolate the E- and 
B-mode signal shown in Figure 4. In Appendix B we show how to calculate 
the correlation function from galaxy shears. 
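
The pairwise decomposition described above can be sketched as a brute-force estimator. This is a minimal illustration under stated assumptions and is not necessarily the method of Appendix B: it uses a common sign convention in which the tangential component is positive for shapes perpendicular to the separation vector, it loops over all pairs (so it only suits small catalogues), and the names `tangential_cross` and `xi_plus` are hypothetical.

```python
import numpy as np


def tangential_cross(g, phi):
    """Tangential and cross shear relative to a separation at position angle phi."""
    rotated = -g * np.exp(-2j * phi)
    return rotated.real, rotated.imag


def xi_plus(x, y, g, bin_edges):
    """Mean of g_t g_t + g_x g_x over galaxy pairs, binned by separation."""
    i, j = np.triu_indices(len(x), k=1)          # every distinct pair once
    dx, dy = x[j] - x[i], y[j] - y[i]
    separation = np.hypot(dx, dy)
    phi = np.arctan2(dy, dx)
    gt_i, gx_i = tangential_cross(g[i], phi)
    gt_j, gx_j = tangential_cross(g[j], phi)
    products = gt_i * gt_j + gx_i * gx_j
    which = np.digitize(separation, bin_edges) - 1
    n_bins = len(bin_edges) - 1
    xi = np.full(n_bins, np.nan)                 # NaN marks empty bins
    for b in range(n_bins):
        selected = which == b
        if selected.any():
            xi[b] = products[selected].mean()
    return xi
```

As a sanity check, a catalogue in which every galaxy carries the same complex shear $g$ gives $|g|^2$ in every populated separation bin, because a constant field is perfectly correlated on all scales.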

Complementary to the correlation function is the power spectrum $C(\ell)$, 
which is simply the Fourier transform of the correlation function, 

(5) $$C(\ell) = \int \xi(\Delta\theta)\, e^{i \ell \cdot \Delta\theta}\, {\rm d}^2 \Delta\theta,$$

where $\ell = (\ell_x, \ell_y)$ is a 2D wavevector probing scales of order $\Delta\theta = 2\pi/|\ell|$. 
The shear power spectrum can again be related to the underlying matter 
distribution. In Appendix B we outline a simple method for creating the 
shear power spectrum from the local $g_1$ and $g_2$ shear estimators from each 
galaxy. When the simulations are made public we will also provide open- 
source code that will enable participants to create any of these statistics 
from a catalogue of discretely estimated shear values. 

The reconstruction of the shear field variation to sufficient accuracy, in 
terms of the power spectrum or correlation function, is the GREAT10 Galaxy 
Challenge. 
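
For shear values placed on a regular grid, the E- and B-mode power can be estimated with fast Fourier transforms. The sketch below is an illustration only and is not the code to be released with the challenge: it assumes a square, periodic, flat-sky grid, returns power in arbitrary units, and the names `fourier_grid`, `kappa_to_shear` and `eb_power` are hypothetical. The helper `kappa_to_shear` builds a pure E-mode field from a scalar map and is included only to test the decomposition.

```python
import numpy as np


def fourier_grid(n):
    """Wavevector magnitude (radians per pixel) and polar angle on an n x n grid."""
    freq = 2.0 * np.pi * np.fft.fftfreq(n)
    lx, ly = np.meshgrid(freq, freq)
    return np.hypot(lx, ly), np.arctan2(ly, lx)


def kappa_to_shear(kappa):
    """Pure E-mode shear (g1, g2) generated by a scalar field kappa."""
    _, phi = fourier_grid(kappa.shape[0])
    k_hat = np.fft.fft2(kappa)
    k_hat[0, 0] = 0.0                      # the mean of kappa produces no shear
    g1 = np.fft.ifft2(np.cos(2 * phi) * k_hat).real
    g2 = np.fft.ifft2(np.sin(2 * phi) * k_hat).real
    return g1, g2


def eb_power(g1, g2, n_bins=8):
    """Azimuthally averaged E- and B-mode power of a gridded shear field."""
    n = g1.shape[0]
    ell, phi = fourier_grid(n)
    g1_hat, g2_hat = np.fft.fft2(g1), np.fft.fft2(g2)
    # Rotate the two shear components into E (gradient) and B (curl) parts.
    e_hat = np.cos(2 * phi) * g1_hat + np.sin(2 * phi) * g2_hat
    b_hat = -np.sin(2 * phi) * g1_hat + np.cos(2 * phi) * g2_hat
    edges = np.linspace(0.0, ell.max(), n_bins + 1)
    which = np.clip(np.digitize(ell.ravel(), edges) - 1, 0, n_bins - 1)
    counts = np.maximum(np.bincount(which, minlength=n_bins), 1)
    p_e = np.bincount(which, weights=np.abs(e_hat.ravel()) ** 2,
                      minlength=n_bins) / counts
    p_b = np.bincount(which, weights=np.abs(b_hat.ravel()) ** 2,
                      minlength=n_bins) / counts
    centres = 0.5 * (edges[1:] + edges[:-1])
    return centres, p_e / n**2, p_b / n**2
```

An odd grid size avoids the Nyquist mode, for which the rotation is not uniquely defined; with an even grid size a small amount of E-to-B leakage appears at the highest frequencies in this simple implementation.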

2.4. Simulating a variable shear field. In GREAT10 we will simulate the 
shear field as a Gaussian random field. These fields have a random distribu- 
tion of phases, but a distribution of amplitudes described by the input power 
spectrum. These are simplified simulations relative to the real shear distri- 
bution. In particular, the filamentary structures seen in the "cosmic web" 
are due to more realistic effects, and will not be present in the GREAT10 
simulations. In Figure 5 we show an example of the simulated shear field in 
a GREAT10 image. 
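
A Gaussian random field with a chosen power spectrum can be generated by filtering white noise in Fourier space. This is a minimal sketch and not the GREAT10 simulation code: it produces a scalar field on a periodic square grid, and the function name `gaussian_random_field` and the power-law spectrum in the usage line are hypothetical.

```python
import numpy as np


def gaussian_random_field(n, power, rng):
    """Real n x n Gaussian field whose Fourier amplitudes follow power(ell)."""
    freq = 2.0 * np.pi * np.fft.fftfreq(n)
    lx, ly = np.meshgrid(freq, freq)
    ell = np.hypot(lx, ly)
    amplitude = np.zeros_like(ell)
    nonzero = ell > 0                      # zero mode left at 0: zero-mean field
    amplitude[nonzero] = np.sqrt(power(ell[nonzero]))
    # White noise has random phases; scaling its Fourier modes by sqrt(P)
    # imprints the desired spectrum while keeping the field real.
    white = rng.standard_normal((n, n))
    return np.fft.ifft2(np.fft.fft2(white) * amplitude).real


# Example: a field with more power on large scales than on small scales.
field = gaussian_random_field(64, lambda ell: ell ** -2.0,
                              np.random.default_rng(42))
```

With a flat spectrum the function simply returns the mean-subtracted white noise, which gives a quick consistency check of the filtering step.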

In addition to the cosmologically interesting shear effect, galaxy images 
are smoothed by a spatially variable convolution kernel (PSF). 

2.5. Variable PSF. The images that we use in gravitational lensing anal- 
yses are smoothed and distorted by a convolution kernel (or Point Spread 
Function, PSF). The PSF that these images are convolved with is produced 
by a combination of effects: 

• The images we use are created by observations with telescopes, which 
have a characteristic PSF that can vary as a function of position in the 
image due to the exact optical setup of the telescope and camera. Also, 
the detectors we use, Charge Coupled Devices (CCDs), pixelize the image; 
and defects can cause image degradation. 

• When making observations with telescopes on the ground, the atmosphere 
acts to induce an additional PSF (due to refraction in the atmosphere 
and turbulence along the path of the photons). 

• A telescope may move slightly during an observation, adding an additional 
smoothing component to the PSF. Typical observations last for seconds or 
minutes during which various effects (such as wind, temperature gradients, 
vibrations, etc.) can cause the telescope to move. 

Fig. 5. A GREAT10 simulated shear variation. An example of the variable shear field in 
a GREAT10 image. Left panel: we show the simulated mass distribution, which is a 
Gaussian realization with a particular power spectrum. Right panel: we show the shear 
represented by a line with length proportional to the amplitude of the shear $|g|$ and an angle $\phi$; 
see (1) and (2). 

Each of these effects can vary across an image in either a deterministic fash- 
ion (as in telescope optics) or randomly (as in the atmosphere). In Figure 6 
we show the effect on an ensemble of stars, both schematically and from re- 
alistic simulations. In Figure 7 we show the spatial variation of a simulated 
GREAT10 PSF. 

We can estimate the local PSF, in the presence of pixelization, from im- 
ages of stars. This is because stars are point-like objects that only experience 
the convolving effect of the PSF and are not subject to the shear distortion; 
the stars we observe are part of our own galaxy, the Milky Way. The prob- 
lem is that we have a discrete number of point-like PSF estimators in each 
image and a spatially varying PSF. What we need is an accurate PSF re- 
construction at nonstar positions (the positions of the sheared galaxies). 

There are currently two broad classes of approach, defined here, that 
have been investigated to deal with this sparsely sampled variable PSF: 

• Direct modeling uses star images as discrete estimators of a spatially vary- 
ing model (typically some low-order polynomial) and finds a best-fit so- 
lution to this model. 

• Indirect modeling, for example, uses an ensemble of stellar images at dif- 
ferent positions to extract principal components (or eigenvectors) of the 
PSF across the images. 

Fig. 6. Upper panels: Schematic effect of atmosphere and telescope optics. A distribution 
of point-like star images is blurred and the resulting pattern has variable spatial structure. 
Lower panels: Real effect of the atmosphere [from Rowe (2010)] and telescope optics 
[example from Jarvis, Schechter and Jain (2008)] on the PSF ellipticity and size, respectively. 
The atmosphere adds random, coherent, patterns and the telescope adds specific functional 
behavior due to optical effects. 



These techniques, direct and indirect modeling, have been used on gravita- 
tional lensing images, but none have yet reached the required accuracy to 
fully exploit the most ambitious future gravitational lensing experiments. 
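
The direct modeling approach can be illustrated with an ordinary least-squares fit of a low-order polynomial to one PSF quantity measured at star positions. This is a minimal sketch of the general idea rather than any particular published method; the function names, the choice of a second-order polynomial and the use of a single scalar quantity (for example one ellipticity component) are assumptions made for illustration.

```python
import numpy as np


def poly_design_matrix(x, y, order):
    """Columns x**i * y**j for all i + j <= order."""
    columns = [x**i * y**j
               for i in range(order + 1)
               for j in range(order + 1 - i)]
    return np.column_stack(columns)


def fit_psf_polynomial(x_star, y_star, value_star, order=2):
    """Least-squares polynomial coefficients from measurements at star positions."""
    design = poly_design_matrix(x_star, y_star, order)
    coeffs, *_ = np.linalg.lstsq(design, value_star, rcond=None)
    return coeffs


def evaluate_psf_polynomial(coeffs, x, y, order=2):
    """Evaluate the fitted model at arbitrary (for example galaxy) positions."""
    return poly_design_matrix(x, y, order) @ coeffs
```

A fit of this kind recovers a smooth, low-order variation exactly when the measurements are noiseless, but it cannot follow structure on scales smaller than the typical separation between stars, which is one reason more flexible models are of interest.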

In addition, there are a variety of deconvolution algorithms that attempt 
to remove the PSF from images by applying an algorithm that reverses the 
effect of the convolution. The deconvolution approach has been investigated, 
but has not been implemented with gravitational lensing observations to 
date. In Appendix E we outline in more detail these three PSF modeling 
approaches. 

To first order the PSF is commonly parameterized with a size (usually 
defined using a Full-Width-Half-Maximum, FWHM, Gaussian measure) and 
an ellipticity, defined in a similar way to shear 

(6) $$e = |e|[\cos(2\psi) + i\sin(2\psi)] = e_1 + i e_2,$$

where again $e_1$ represents "+ type" distortions and $e_2$ represents "× type" 
distortions. In the GREAT10 Galaxy Challenge the PSF will be provided as 
a known function $f(e, R)$ parameterized with a size $R(\theta_x, \theta_y)$ (FWHM) and 
ellipticity $e(\theta_x, \theta_y)$ that will vary across each image, so that any inaccuracies 
caused by PSF misestimation should be removed from the problem. 

Fig. 7. A GREAT10 simulated PSF variation. An example of the variable PSF field 
in a GREAT10 image. Left panel: we show the simulated magnitude of the ellipticity, 
which contains atmospheric and telescope optics effects. Right panel: we show the ellipticity 
represented by a line with length proportional to the amplitude of the ellipticity $|e|$ and an 
angle $\psi$; see (6). 

However, in addition to measuring shear very accurately, PSF estimation 
is of crucial importance for gravitational lensing. If we cannot characterize 
the PSF sufficiently at galaxy positions, our shear values will be inaccurate. 
This is not addressed in the Galaxy Challenge where the PSF is a known 
function, so here we set the additional task of estimating the PSF at nonstar 
positions as a challenge in itself; the GREAT10 Star Challenge. 

2.6. Summary of effects. To summarize the effects included in the sim- 
ulations, we show the effect induced on an individual galaxy and star image 
in Figure 8. This "forward process," from a galaxy to an image, was de- 
tailed in Bridle et al. (2009). Each galaxy image is distorted by the matter 
distribution; this image is then convolved by a PSF that varies spatially as 
a result of possible atmospheric effects and telescope optics, and finally the 
image is pixelized. Star images experience the convolution and pixelization 
but are not distorted by the shear field. 

The GREAT08 challenge focused on the determination of the shear g, in 
the presence of these effects, by creating images in which a constant shear 
and a constant PSF had been applied to all objects (by creating constant- 
shear images, algorithms designed to estimate the average shear from an 
ensemble of galaxies can be tested). In GREAT10 we will move toward the 
more demanding, and more realistic, regime of variable shear fields. 

In gravitational lensing analyses to date (c. 2010), we estimate the local 
shear from individual galaxies and use these to reconstruct the two-point 
statistics of the shear field. In Appendix D we review current methods that 
are used to estimate the shear from galaxies [including some advancements 
made since GREAT08; Bridle et al. (2009)]. 

Fig. 8. Summary of the main effects on an individual galaxy or star. This is the forward 
process described in Bridle et al. (2009), although GREAT10 includes additional realism 
in that both the induced gravitational shear distortion and the PSF vary as a function of 
position. The GREAT10 "Galaxy Challenge" is to estimate the shear distortion applied to 
a galaxy image, correcting for the additional effects in the Earth's atmosphere and telescope 
optics, which are also experienced by the images of stars; for space-based telescopes the 
atmospheric effects are not present so that telescope and detector effects alone induce 
a PSF. 

3. Simulation details. In GREAT10 the main Galaxy Challenge is to 
reconstruct the shear field in the presence of a variable PSF. We also present 
the Star Challenge that is independent from the main challenge: 

• Galaxy Challenge. This is the main challenge for GREAT10. The objec- 
tive is to reconstruct the shear power spectrum. This is most similar to 
GREAT08. In this challenge the varying PSF will be a known function. 

• Star Challenge. This separate challenge is to reconstruct the PSF at 
nonstar positions given an image of PSF-convolved stars. 

The GREAT10 structure is schematically represented in Figure 1. 

3.1. Galaxy challenge. In each Galaxy Challenge image there will be 
a different realization of the shear power spectrum. The Galaxy Challenge 
will be subdivided into a low noise (high signal-to-noise) set and a realistic 
(real) noise set. The low noise subset will be closely matched in substructure 
to the real noise set such that participants will be able to analyze whether 
training on such data is of use in shape measurement. There will be fewer 
low noise images than real noise images (similar to GREAT08) to reflect our 
requirements on accuracy (see Section 4.2). 

For the main Galaxy Challenge the input PSF will be provided as a func- 
tional form that will allow the spatially varying PSF to be reconstructed at 
any spatial position with sub-pixel resolution. Images will be supplied con- 
taining sheared galaxy images, and the approximate positions of each object 
will be provided. The participants will be asked to provide a reconstruction 
of the shear power spectrum at specified $\ell$ mode ($2\pi$/separation) values. 
Participants will submit one of the following: 

• A "shear catalogue": a value of $g_1$ and $g_2$ for every galaxy in each image.$^{13}$ 

• A correlation function as a function of separation: at the launch of the 
challenge we will specify the exact values and binning required for this 
type of submission. 

• A shear power spectrum as a function of $\ell = 2\pi$/separation: at the launch 
of the challenge we will specify the exact values and binning required for 
this type of submission. 

In addition to the innovation of variable fields, we will also make the Galaxy 
Challenge simulation more realistic in that the distributions of galaxy 
properties, for example, size and signal-to-noise, will in some images be 
continuous rather than discrete. The GREAT10 galaxy models 
are similar to those in GREAT08; they will consist of two components, 
a "bulge" and a "disk" (each with a profile $e^{-kR^{1/n}}$, with $n = 4$ and 
$n = 1$, resp.); these components may be misaligned and have varying 
intensities. 
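
The two-component radial profile can be written down directly. The sketch below is illustrative only: it assumes the conventional assignment of $n = 4$ to the bulge and $n = 1$ to the disk, uses a one-dimensional circular profile (so it does not model misalignment), and the names and parameter values `bulge_weight`, `k_bulge` and `k_disk` are hypothetical.

```python
import numpy as np


def sersic(R, k, n):
    """Profile exp(-k R^(1/n)), normalized to 1 at R = 0."""
    return np.exp(-k * np.asarray(R, dtype=float) ** (1.0 / n))


def bulge_plus_disk(R, bulge_weight=0.3, k_bulge=3.0, k_disk=1.0):
    """Weighted sum of an n = 4 bulge and an n = 1 (exponential) disk."""
    return (bulge_weight * sersic(R, k_bulge, 4)
            + (1.0 - bulge_weight) * sersic(R, k_disk, 1))
```

Varying `bulge_weight` between 0 and 1 moves the model from a pure exponential disk to a pure bulge, which is one simple way to represent components of varying intensity.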

3.2. Star challenge. In each Star Challenge image there will be a different 
PSF. These Star Challenge images will be grouped into sets that contain the 
same underlying PSF, except for the presence of a possible random compo- 
nent that will have a different realization in each image, to represent a series 
of gravitational lensing observations. The Star Challenge will only have one 
high signal-to-noise level. In real CCDs some bright stars can "saturate" 
the images and need to be masked; in GREAT10 there will be no saturated 
stars. 

For the Star Challenge, participants will be asked to provide an estimation 
of the PSF at nonstar positions. 

All additional necessary information on the simulations will be provided 
on the GREAT10 webpage at the time of launch. Participants will also be 
provided with training sets for each Challenge that will be exactly 
representative of some subset of the main Challenge. In real data we do not have 
training sets and would have to rely on simulations being accurate enough 
representations of the Universe. We nevertheless invite participants to use the 
training data, with the caveat that this may only guarantee unbiased results 
in some subset of the main Challenge. 

4. The challenge. In this section we will summarize some of the practical 
details of the Challenge and define our evaluation procedure. 



13 Nonzero weights for individual objects will also be allowed. 

4.1. Challenge details. The GREAT10 competition will run for 9 months. 
The competition will be to achieve the largest quality factors in either the 
Galaxy or the Star Challenge. Each challenge will be run as a separate 
competition. We will award prizes for the largest average quality factor in the 
Galaxy Challenge (there will only be one prize for all types of submission: 
catalogue, power spectrum, correlation function). We will also award a prize, 
in either the Star or Galaxy Challenge, for a method that performs well un- 
der a variety of simulated conditions or whose innovation is particularly 
noteworthy. There will be a mid-challenge workshop and a final GREAT10 
meeting, where we will present the Challenge prizes. 

Participants will be required to download the simulation data. We will 
provide download nodes, hosted by GREAT10 team members, over various 
continents for ease of accessibility. The simulated data for GREAT10 will 
constitute approximately 900 + 50 GB over the Galaxy and Star Challenges, 
respectively. 

The submission process for the competition will be through a web inter- 
face similar to GREAT08 [Bridle et al. (2009); GREAT08 Handbook], with 
a live leaderboard of average quality factors continuously updated. We will 
also publish a detailed results paper where the performance of methods will 
be shown as a function of various properties of the simulated images. We 
outline some rules of the competition in Appendix F. 

4.2. Challenge evaluation. Each submission will result in a shear power
spectrum $C(\ell)$ being calculated, either directly by the participant or
internally after a shear catalogue or correlation function submission. The
submitted power spectrum will then be compared to the true input power
spectrum for each image in the simulation and a quality factor calculated.

The quality factor for an individual image is defined using the differ- 
ence between the submitted power spectrum and the input power spectrum 
[Amara and Refregier (2008)], which is related to variance of the measured 
and true shears. For image i this is defined as 

(7) $\sigma_i^2 = \frac{1}{2\pi} \int \ell(\ell+1)\, \left| C^{\rm input}(\ell) - C^{\rm submitted}(\ell) \right| \, d\ln\ell.$

This is a quantity that has been used in cosmic shear analyses to gauge 
the impact of systematic effects on cosmological parameter constraints. The
factor $\ell(\ell+1)$ comes from summing over the density of states in Fourier
space. For GREAT10 we define the Galaxy Challenge quality factor as

(8) $Q_{\rm GREAT10} = \frac{M}{\langle \sigma_i^2 \rangle},$

where the angular brackets denote an average over all images in the sim- 
ulation. The numerator M will be determined subject to the exact range 
of scales $[\ell_{\min}, \ell_{\max}]$ in (7), which will be defined at the challenge launch.
The winner of the Galaxy Challenge will be the method that results in the
largest average quality factor over all images. In the final analysis of the
results we will also consider alternative measures such as the L-1 and L-2
norms of the shear estimates versus the true shears. The results of the
Star Challenge will also be gauged by a quality factor, which we outline in 
Appendix E. 
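
As an illustration of how (7) and (8) could be evaluated numerically, the following Python sketch integrates the absolute power spectrum difference by the trapezoidal rule in $\ln\ell$. The numerator M, the range of scales, the binning and the toy power spectrum are all placeholders chosen for this example (assumptions), since the actual values are only fixed at the challenge launch.

```python
import math

def sigma2(ell, c_input, c_submitted):
    """Approximate equation (7) with the trapezoidal rule in ln(ell)."""
    integrand = [l * (l + 1.0) * abs(ci - cs)
                 for l, ci, cs in zip(ell, c_input, c_submitted)]
    ln_ell = [math.log(l) for l in ell]
    total = 0.0
    for k in range(len(ell) - 1):
        total += 0.5 * (integrand[k] + integrand[k + 1]) * (ln_ell[k + 1] - ln_ell[k])
    return total / (2.0 * math.pi)

def quality_factor(sigma2_per_image, m_numerator):
    """Equation (8): Q = M / <sigma^2>, with the average taken over images."""
    mean_sigma2 = sum(sigma2_per_image) / len(sigma2_per_image)
    return m_numerator / mean_sigma2

# Toy example: a hypothetical power-law spectrum and a submission that is
# 1 per cent too high in amplitude at every scale.
ell = [100.0 * 1.2 ** k for k in range(20)]
c_true = [1e-9 * (l / 100.0) ** -1.0 for l in ell]
c_est = [1.01 * c for c in c_true]
s2 = sigma2(ell, c_true, c_est)
q = quality_factor([s2, s2], m_numerator=1e-4)  # M = 1e-4 is an assumption
```

Because (7) uses the absolute difference, positive and negative residuals at different scales cannot cancel in this measure.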

The GREAT10 quality factor is different from the GREAT08 factor [Bridle
et al. (2009)], which used the root-mean-square of the shear residuals
as the denominator. The GREAT08 quality factor, and goal, was designed
primarily with an additive bias $\tilde{g} = g + b$ in mind, as discussed in
Bridle et al. (2009); $b$ in this case is a bias that is constant with respect
to $g$. However, both STEP and GREAT08 have shown that a multiplicative
correction $\tilde{g} = g \times f(g)$ is also important, where $f(g)$ can be some
function of $g$ (possibly constant). The GREAT10 quality factor is sensitive
to both these effects, as well as any misestimation of shear variance
contributions ($\sigma_g$).

5. GREAT10 simplifications and future challenges. We expect GREAT10 
to have a significant impact on the gravitational lensing and cosmological 
communities, enabling the exploitation of the next generation of experi- 
ments. The simulations we have designed present a unique challenge to 
computer science, statistics and astronomy communities; we require extreme 
accuracy from a very large data set, and have limited training data. 

In GREAT10 we have set the challenge of estimating the variation of the 
gravitational lensing shear signal in the presence of a realistic PSF model. 
By advancing the GREAT challenges in this direction, we have addressed 
some of the simplified assumptions that were made in GREAT08. As a re- 
sult, GREAT10 is a demanding challenge — however, this is only the second 
stage in a series of challenges that will work toward creating realistic gravita- 
tional lensing simulations. Some of the simplified assumptions in GREAT10 
simulations compared to real data include the following: 

• Gaussian Random Field: The shear distribution in GREAT10 is a Gaus- 
sian random field with random phases. In real data the field may have 
non-Gaussian signatures, and nonrandom phase information. 

• Known PSF: With real data we must estimate the PSF and determine the 
shear from each image. In GREAT10 we have separated these problems 
into two challenges; in future GREAT challenges these aspects will be 
combined. 

• Weak Lensing: The shear field in GREAT10 only contains galaxies that 
are weakly sheared (the local distortion entirely described by $g_1$ and $g_2$). In
reality, second-order effects are present and galaxies can even be strongly 
distorted (into arcs) or multiply imaged. 

• Simple Galaxy Shapes: The galaxies used in this Challenge are simple
relative to real data (similar to GREAT08). 

• Simple Noise Model: The noise in the GREAT10 images will contain 
a Poisson term from the image creation process, which mimics photon emission
from galaxies and stars, and an additional Gaussian component, to
mimic noise in detectors. In practice, there are unusable bad pixels which 
have to be flagged or removed. 

• Background Estimation/Heteroskedasticity: The noise in GREAT10 is 
constant across the image, where the data is modeled as a signal plus 
additive noise. Real data will also have an uncertain additive background 
component whose estimation further complicates calculation of uncertain- 
ties. 

• Image Construction: In GREAT10 objects are distributed across the im- 
age with no overlaps. Furthermore, each object will be classified to the 
participant as a star or galaxy. In reality, the identification of objects is 
a further problem, and objects commonly overlap. 

• Masking: The GREAT10 images do not have any data missing. In reality, 
data can be missing or incomplete, for example, due to the layout of 
the CCDs used to create the images. Furthermore, areas of images in 
real data are intentionally masked, for example, around very bright stars 
(that saturate the images) and satellite tracks (that leave bright linear 
trails across images). 

• Intrinsic Ellipticity: In GREAT10 the intrinsic (nonsheared) ellipticities 
have only B-mode correlation (see Appendix A). This is constructed to
reduce the simulation size to a manageable level. In real data the un- 
sheared galaxies are expected to have a random orientation (equal E- 
and B-mode). There are also secondary effects that act to align galaxies 
that are in close physical proximity which can contaminate the lensing 
signal. 

We envisage that the next GREAT challenge will build upon GREAT10 by 
including one or more of these effects. 

In addition, there are a multitude of further effects that will be present 
in real data, for example, nonlinear CCD responsivity and Charge Transfer 
Inefficiency (CTI), variation in exposure times across a data set and variation 
of the PSF as a function of wavelength, to name a few. A further challenge 
will be to handle the enormous amount of data, of order petabytes, that will 
need to be analyzed over reasonable timescales. 

The ultimate challenge for methods developed and tested on the GREAT 
challenges will be their application to data. In partnership with state-of-the- 
art instrumentation, the GREAT challenges will help scientists use gravita- 
tional lensing to answer some of our most profound and fundamental ques- 
tions about the Universe. 

APPENDIX A: REMOVAL OF INTRINSIC SHAPE NOISE IN
VARIABLE SHEAR SIMULATIONS 

In shear simulations we must make some effort to reduce the effect of
the intrinsic ellipticity noise. This can be understood if we take an example
requirement on the variance of shear systematics to be $\sigma_{\rm sys}^2 = 10^{-7}$,
which is a requirement on the variance in the estimated shear values such
that our cosmological parameter estimation is unbiased.

Assuming that the intrinsic (unsheared) ellipticities are independent and
identically distributed random variables, with a variance $\sigma_e^2$, we will
need $N_{\rm gal} = (\sigma_e/\sigma_{\rm sys})^2$ to reach this accuracy.$^{14}$
If $\sigma_e^2 \sim 0.1$ (a typical empirically observed quantity), this means
$\sim 10^6$ galaxies are required per shear value to reach the systematic floor.
In GREAT08, with $6 \times 50$ shear values over 9 simulation conditions, this
results in a number which is large ($2.7 \times 10^9$ galaxies) and difficult to
analyze in a short timescale.
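
The arithmetic behind these numbers can be checked with a few lines of Python. The variable names are chosen for this illustration only; the input values are the ones quoted in the text.

```python
# Values quoted in the text above.
sigma_e2 = 0.1       # variance of the intrinsic ellipticities
sigma_sys2 = 1e-7    # required variance of the shear systematics

# Galaxies needed per shear value: N_gal = (sigma_e / sigma_sys)^2.
n_gal_per_shear = sigma_e2 / sigma_sys2

# GREAT08-like configuration: 6 x 50 shear values over 9 simulation conditions.
n_shear_values = 6 * 50
n_conditions = 9
n_total = n_gal_per_shear * n_shear_values * n_conditions

print(f"{n_gal_per_shear:.1e} galaxies per shear value, {n_total:.1e} in total")
```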

In constant shear simulations, as was done in GREAT08, galaxies can be 
created in pairs such that they have the same unsheared ellipticity except 
that one has been rotated by 90 degrees. The 90 degree rotation converts 
an ellipticity to $e \to -e$. If we have rotated (by 90 degrees) and unrotated
galaxy pairs, then the intrinsic shape noise cancels to first order and the
variance on the shear is reduced to $g^2(\sigma_e^4/2N)$ [Massey et al. (2007)]. This
can be understood if we write the shear estimator for a single object i as 

(9) $\tilde{g}^i = e^i + P^i g^i + O(g^2),$

where P is some response factor (a matrix that partially encodes the effect of 
the PSF) and higher-order terms also encapsulate any noise. In a 90 degree 
rotated image the shear estimator can be written as 

(10) $\tilde{g}^i_{\rm rot} = -e^i + P^i g^i + O(g^2).$

It can be easily seen that by averaging the individual shear estimates the 
intrinsic ellipticity contribution cancels. 
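
The cancellation can be illustrated with a small Monte Carlo sketch in Python. It keeps only the first-order terms of the two estimators, treats the response factor P as a scalar equal to one, and draws intrinsic ellipticities from a Gaussian with dispersion 0.3 per component; all three choices are simplifying assumptions made for this example only.

```python
import cmath
import math
import random

random.seed(1)
g_true = complex(0.02, -0.01)  # hypothetical constant shear
P = 1.0                        # scalar response factor (assumption)
n_pairs = 1000

single, paired = [], []
for _ in range(n_pairs):
    e = complex(random.gauss(0.0, 0.3), random.gauss(0.0, 0.3))
    # Ellipticity is a spin-2 quantity: rotating the galaxy by 90 degrees
    # multiplies e by exp(2i * pi/2) = -1.
    e_rot = e * cmath.exp(2j * math.pi / 2.0)
    g_unrot = e + P * g_true      # first-order estimator, unrotated galaxy
    g_rot = e_rot + P * g_true    # first-order estimator, rotated galaxy
    single.append(g_unrot)
    paired.append(0.5 * (g_unrot + g_rot))

mean_single = sum(single) / n_pairs
mean_paired = sum(paired) / n_pairs
print(abs(mean_single - g_true), abs(mean_paired - g_true))
```

In this idealized setting the paired estimate recovers the input shear to rounding error, whereas the unpaired average retains shape noise of order the ellipticity dispersion divided by the square root of the number of galaxies.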

In variable shear simulations, like GREAT10, the 90 degree rotation me- 
thod cannot be used. In GREAT10 we require the E-mode correlation func- 
tion or power spectrum to be reconstructed. If we have two images with 90 
degree rotated galaxies, then by taking the correlation of the above shear 
estimates equations (9) and (10) we have 

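(11) $\langle \tilde{g}_\alpha \tilde{g}_\beta \rangle_{\rm unrot} = \langle e_\alpha e_\beta \rangle + P^2 \langle gg \rangle + O(g^4),$

(12) $\langle \tilde{g}_\alpha \tilde{g}_\beta \rangle_{\rm rot} = \langle e_\alpha e_\beta \rangle + P^2 \langle gg \rangle + O(g^4).$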



14 Since |e| < 1 this argument is approximate, since the distribution is truncated and 
hence not Gaussian. 

It can be seen from these that we cannot combine correlation function
estimates from rotated and unrotated images to remove the intrinsic
ellipticity correlation function while preserving the shear correlation
function. This is fundamentally because the intrinsic ellipticity correlation
function is a random field, with equal E- and B-mode contributions.

For variable shear simulations we can take advantage of the fact that the 
shear correlation function contains E-mode correlations only. In GREAT10 
the intrinsic ellipticity correlation function will contain B-mode correlations 
only. This means that the correlation function from any image can be written 
as in (11), where now $\langle e_\alpha e_\beta \rangle$ only contains B-mode
correlation and $\langle gg \rangle$ only contains E-mode correlation. By taking
the E-mode component of the shear estimate $\langle gg \rangle$, we will then
eliminate any contribution from the intrinsic
ellipticity distribution. Thus, the simulation size, like the 90 degree-rotated 
case for constant shear images, will not be determined by the intrinsic ellip- 
ticity variance. 

This step is unrealistic in that real galaxies are not expected to have any 
preferred unsheared correlation (to first order), but is necessary for public 
simulations to make the size of the challenge approachable. 

APPENDIX B: CALCULATING THE SHEAR CORRELATION FUNCTION AND
SHEAR POWER SPECTRUM FOR GREAT10

In this Appendix we present methods that can be used to estimate the 
shear correlation function and power spectrum from the individual shear 
estimators from each galaxy. 

At the beginning of the challenge we will provide links to open-source 
code to calculate these various statistics, given a catalogue of shear values 
per galaxy. The Galaxy Challenge submission procedure will allow for either 
a shear catalogue (per galaxy), a correlation function or a power spectrum 
to be submitted. 

Calculating the correlation function. Here we summarize Crittenden et
al. (2002) (and refer the reader to that article), where shear-correlation
statistics are presented in detail.

If we define a Cartesian coordinate system $(\theta_x, \theta_y)$ on a patch of
sky (assumed small) and $\phi$ is the angle between the line joining two
galaxies and the x-axis, then the tangential component can be written as
$g_+ = -[g_1\cos(2\phi) + g_2\sin(2\phi)]$ and the cross term as
$g_\times = -[g_1\sin(2\phi) - g_2\cos(2\phi)]$. We can take the correlation
function of these two quantities

(13) $\xi_+(\Delta\theta) = \langle g_+(\theta_x,\theta_y)\, g_+(\theta_x,\theta_y;\Delta\theta) \rangle,$
     $\xi_\times(\Delta\theta) = \langle g_\times(\theta_x,\theta_y)\, g_\times(\theta_x,\theta_y;\Delta\theta) \rangle,$
where $\Delta\theta$ is the average radial (angular) separation of galaxy pairs,
and the angle brackets represent an average over all galaxy pairs within a
considered $\theta$. Here we refer to sums over finite-width bins in $\theta$.
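
A minimal Python sketch of this pair-based estimator is given below. It loops over all galaxy pairs, projects both shears onto the line joining the pair, and averages the products in bins of separation. The catalogue layout (tuples of position and shear), the function names and the brute-force double loop are assumptions made for this illustration; it is not the code that will be linked from the challenge webpage.

```python
import math

def rotate_shear(g1, g2, phi):
    """Return (g_plus, g_cross) for a pair whose separation vector makes
    an angle phi with the x-axis."""
    g_plus = -(g1 * math.cos(2.0 * phi) + g2 * math.sin(2.0 * phi))
    g_cross = -(g1 * math.sin(2.0 * phi) - g2 * math.cos(2.0 * phi))
    return g_plus, g_cross

def correlation_functions(catalogue, bin_edges):
    """catalogue: list of (x, y, g1, g2); bin_edges: increasing separations.
    Returns binned xi_plus, xi_cross (None for empty bins) and pair counts."""
    n_bins = len(bin_edges) - 1
    sum_plus = [0.0] * n_bins
    sum_cross = [0.0] * n_bins
    counts = [0] * n_bins
    for a in range(len(catalogue)):
        xa, ya, g1a, g2a = catalogue[a]
        for b in range(a + 1, len(catalogue)):
            xb, yb, g1b, g2b = catalogue[b]
            sep = math.hypot(xb - xa, yb - ya)
            phi = math.atan2(yb - ya, xb - xa)
            for k in range(n_bins):
                if bin_edges[k] <= sep < bin_edges[k + 1]:
                    pa, ca = rotate_shear(g1a, g2a, phi)
                    pb, cb = rotate_shear(g1b, g2b, phi)
                    sum_plus[k] += pa * pb
                    sum_cross[k] += ca * cb
                    counts[k] += 1
                    break
    xi_plus = [s / c if c else None for s, c in zip(sum_plus, counts)]
    xi_cross = [s / c if c else None for s, c in zip(sum_cross, counts)]
    return xi_plus, xi_cross, counts

# Two galaxies on the x-axis with identical shear (g1, g2) = (0.1, 0).
toy = [(0.0, 0.0, 0.1, 0.0), (1.0, 0.0, 0.1, 0.0)]
xi_plus, xi_cross, counts = correlation_functions(toy, [0.5, 1.5])
```

The double loop scales as the square of the number of galaxies, so a tree or grid based pair search would be preferable for images containing many objects.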

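The E- and B-mode correlation functions are related to these observable
correlations, combinations of $\xi_+(\Delta\theta)$ and $\xi_\times(\Delta\theta)$, in the following way:

(14) $\xi_E(\Delta\theta) = \frac{1}{2}[\xi_+(\Delta\theta) + \xi_\times(\Delta\theta)] + \frac{1}{2}\nabla^4\chi^{-1}[\xi_+(\Delta\theta) - \xi_\times(\Delta\theta)],$
     $\xi_B(\Delta\theta) = \frac{1}{2}[\xi_+(\Delta\theta) + \xi_\times(\Delta\theta)] - \frac{1}{2}\nabla^4\chi^{-1}[\xi_+(\Delta\theta) - \xi_\times(\Delta\theta)].$

The operator $\nabla^4\chi^{-1}$ is most easily evaluated in Fourier space

(15) $\nabla^4\chi^{-1} g(r) = \int r'\,dr'\, g(r')\, \mathcal{G}(r,r'),$

where

(16) $\mathcal{G}(r,r') = \int \frac{k\,dk}{2\pi}\, J_0(kr)\, J_4(kr').$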

See Crittenden et al. (2002) for more information. In Appendix E we define 
the correlation function in a complementary way in (32), where the binning 
in 9 is more explicit. 

Note that we expect the cosmological signal to contain nearly zero B-mode 
correlations (see Figure 4), but that in GREAT10 the B-modes reveal the 
intrinsic ellipticity distribution by design (see Appendix A). In GREAT10 
the quality factor will be evaluated using the E-mode correlation only. 

Calculating the power spectrum. Recall from (1) that we have two shear
components and that these vary as a function of position across the fields
$g_1(\theta_x,\theta_y)$ and $g_2(\theta_x,\theta_y)$. We can write the "shear" as a complex number
such that

(17) $g = g_1 + i g_2,$

and we can Fourier transform the shear in the following way

(18) $g_R(\ell_x,\ell_y) + i g_I(\ell_x,\ell_y) = \sum_{\rm galaxies} [g_1(\theta_x,\theta_y) + i g_2(\theta_x,\theta_y)]\, e^{-i\boldsymbol{\ell}\cdot\boldsymbol{\theta}},$

where two new Fourier estimators have been created, a real and an imaginary
part which are a function of $\boldsymbol{\ell} = (\ell_x, \ell_y)$. These are simply related to the
original $g_1$ and $g_2$ by

(19) $g_R(\ell_x,\ell_y) = \sum_{\rm galaxies} [g_1(\theta_x,\theta_y)\cos(\ell_x\theta_x + \ell_y\theta_y) + g_2(\theta_x,\theta_y)\sin(\ell_x\theta_x + \ell_y\theta_y)],$
     $g_I(\ell_x,\ell_y) = \sum_{\rm galaxies} [g_2(\theta_x,\theta_y)\cos(\ell_x\theta_x + \ell_y\theta_y) - g_1(\theta_x,\theta_y)\sin(\ell_x\theta_x + \ell_y\theta_y)].$

We now have two estimators that are a function of $\ell_x$ and $\ell_y$, with each
point in the $(\ell_x, \ell_y)$ plane being a sum over all galaxies.

In cosmology we wish to decompose the shear field into an E- and a B-mode;
cosmological structures should only create an E-mode signal. To
make this decomposition, we can describe the $(\ell_x, \ell_y)$ plane by an angle
$\phi_\ell = \tan^{-1}(\ell_y/\ell_x)$ and a scalar $\ell = \sqrt{\ell_x^2 + \ell_y^2}$. The E- and B-mode
fields can now be written as a simple rotation of the Fourier plane

(20) $E(\ell_x,\ell_y) + iB(\ell_x,\ell_y) = [\cos(2\phi_\ell) - i\sin(2\phi_\ell)] \times [g_R(\ell_x,\ell_y) + i g_I(\ell_x,\ell_y)],$
     $E(\ell_x,\ell_y) = \cos(2\phi_\ell)\, g_R(\ell_x,\ell_y) + \sin(2\phi_\ell)\, g_I(\ell_x,\ell_y),$
     $B(\ell_x,\ell_y) = \cos(2\phi_\ell)\, g_I(\ell_x,\ell_y) - \sin(2\phi_\ell)\, g_R(\ell_x,\ell_y).$

Finally, the shear power spectrum can be defined as the modulus of the
E-mode field

(21) $C(\ell) = |E(\ell_x,\ell_y)|^2,$

where in practice the $C(\ell)$ is the average over some bin in $|\ell|$.
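
The direct sums over galaxies in (18) and the rotation in (20) can be sketched in a few lines of Python. The catalogue layout (a list of tuples of position and shear) and the function names are assumptions made for this illustration, and no normalization or binning in $|\ell|$ is applied.

```python
import cmath
import math

def fourier_shear(catalogue, lx, ly):
    """Equation (18): sum over galaxies of (g1 + i g2) exp(-i l.theta).
    The real part of the result is g_R and the imaginary part is g_I."""
    total = 0j
    for theta_x, theta_y, g1, g2 in catalogue:
        phase = lx * theta_x + ly * theta_y
        total += complex(g1, g2) * cmath.exp(-1j * phase)
    return total

def e_b_modes(catalogue, lx, ly):
    """Equation (20): rotate g_R + i g_I by -2 phi_l to obtain E and B."""
    g = fourier_shear(catalogue, lx, ly)
    phi_l = math.atan2(ly, lx)
    rotated = cmath.exp(-2j * phi_l) * g
    return rotated.real, rotated.imag

# A single galaxy at the origin with shear g = 1 + 0i.
single = [(0.0, 0.0, 1.0, 0.0)]
e_along_x, b_along_x = e_b_modes(single, 1.0, 0.0)   # phi_l = 0
e_diagonal, b_diagonal = e_b_modes(single, 1.0, 1.0) # phi_l = 45 degrees
```

For the single-galaxy toy case the shear is pure E along the x-axis of the Fourier plane and pure B along the diagonal, which is the expected behaviour of a spin-2 rotation.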

For data that is distributed as a grid in $\theta_x$ and $\theta_y$, the above calculations
can be simplified even further and written as a series of FFTs:

• Make a 2D FFT of the shear field $g(\theta_x,\theta_y) \to \mathcal{G}(\ell_x,\ell_y)$.

• Construct a 2D $\ell$-matrix $\ell = (2\pi/\theta_x) + i(2\pi/\theta_y)$.

• Rotate the FFT of the shear field $\mathcal{G}(\ell_x,\ell_y) \to \mathcal{G}_R(\ell_x,\ell_y) = (\ell^*\ell^*/|\ell^2|)\,\mathcal{G}(\ell_x,\ell_y)$.

• Inverse FFT the rotated shear field back to real space $\mathcal{G}_R(\ell_x,\ell_y) \to E(\theta_x,\theta_y) + iB(\theta_x,\theta_y)$.

• Select the real part, the E-mode, and FFT $E(\theta_x,\theta_y) \to \mathcal{E}(\ell_x,\ell_y)$ (which is now
complex).

• Calculate the modulus $|\mathcal{E}(\ell_x,\ell_y)|^2 = {\rm Re}[\mathcal{E}(\ell_x,\ell_y)]^2 + {\rm Im}[\mathcal{E}(\ell_x,\ell_y)]^2$ and
azimuthally bin in $|\ell|$ to find the power spectrum.

At the launch of GREAT10 we will specify exactly which binning scheme
in $\ell_x$, $\ell_y$ and $\ell$ we will use to calculate the result, and we will also specify
the angular binning $\theta$ we require for correlation function submissions.
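
The FFT steps listed above can be sketched with NumPy as follows. The function name, the linear binning in $|\ell|$, the absence of any normalization and the choice to zero the $\ell = 0$ mode (where the rotation angle is undefined) are assumptions made for this illustration; the binning actually used for scoring is the one specified at launch.

```python
import numpy as np

def e_mode_power(g1, g2, pixel_scale, n_bins=8):
    """Sketch of the gridded FFT pipeline for the E-mode power spectrum."""
    n_y, n_x = g1.shape
    G = np.fft.fft2(g1 + 1j * g2)                          # 2D FFT of the shear
    lx = 2.0 * np.pi * np.fft.fftfreq(n_x, d=pixel_scale)
    ly = 2.0 * np.pi * np.fft.fftfreq(n_y, d=pixel_scale)
    ell = lx[np.newaxis, :] + 1j * ly[:, np.newaxis]       # complex l-matrix
    ell_abs = np.abs(ell)
    rot = np.zeros_like(ell)
    nonzero = ell_abs > 0
    rot[nonzero] = np.conj(ell[nonzero]) ** 2 / ell_abs[nonzero] ** 2
    eb = np.fft.ifft2(rot * G)                             # E + iB in real space
    e_map, b_map = eb.real, eb.imag
    E = np.fft.fft2(e_map)                                 # FFT of the E-mode map
    power_2d = E.real ** 2 + E.imag ** 2                   # squared modulus
    edges = np.linspace(0.0, ell_abs.max(), n_bins + 1)    # azimuthal bins in |l|
    which = np.digitize(ell_abs.ravel(), edges[1:-1])
    sums = np.bincount(which, weights=power_2d.ravel(), minlength=n_bins)
    counts = np.bincount(which, minlength=n_bins)
    c_ell = sums / np.maximum(counts, 1)
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, c_ell, e_map, b_map

# Toy check: build a pure E-mode shear field from a zero-mean scalar map.
rng = np.random.default_rng(0)
n = 16
kappa = rng.normal(size=(n, n))
kappa -= kappa.mean()
freq = 2.0 * np.pi * np.fft.fftfreq(n, d=1.0)
ell_grid = freq[np.newaxis, :] + 1j * freq[:, np.newaxis]
factor = np.zeros_like(ell_grid)
mask = np.abs(ell_grid) > 0
factor[mask] = ell_grid[mask] ** 2 / np.abs(ell_grid[mask]) ** 2
shear = np.fft.ifft2(factor * np.fft.fft2(kappa))
centres, c_ell, e_map, b_map = e_mode_power(shear.real, shear.imag, 1.0)
```

In the toy check the recovered E-mode map reproduces the input scalar map and the B-mode map vanishes to rounding error, which is a useful sanity test before applying such a pipeline to estimated shears.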

Note that the method described in this section will not work in the case 
that images are masked. In Section 5 we describe how masking is present in 
real data and may be included in future GREAT challenges. 

APPENDIX C: OVERVIEW OF EXISTING SHAPE 
MEASUREMENT METHODS 

We refer the reader to the GREAT08 articles, Bridle et al. (2009), for 
a comprehensive review of shape measurement methods up until late 2009. 
Since the publication of the GREAT08 results paper and up until the
creation of this Handbook, mid-2010, there have been several ongoing activities
in the field.

The GREAT08 simulation details have been made public: both the shear
values (answers) for the simulations and the properties of each individual
object in the simulations have been made available (these can be found by
following the relevant links from http://www.greatchallenges.info).

In the interim period a new blind realization of GREAT08 has been re- 
leased, GREAT08reloaded. This simulation is exactly the same as GREAT08 
except that the true shears (answers) have new values. These simulations 
can be used as a semi-blind challenge such that developing algorithms can be 
tested on new simulated data. A new online quality factor calculator is
available, but we do not show an online ranking at http://www.greatchallenges.info.
Authors of newly developed algorithms are encouraged to publish new
quality factor results from these simulations (please inform T. Kitching and
S. Bridle if you intend to publish results based on GREAT08reloaded).

Some new methods have been published in this interim period. Bernstein 
(2010) has presented an investigation into several effects, related to model 
fitting algorithms, that impact shear measurement biases. Zhang (2010) has 
investigated mapping from ellipticity to shear and has presented a number of 
shear estimators with varying success. Gruen et al. (2010) have used a neu- 
ral net approach, similar to those used for estimating galaxy distances (red- 
shifts), to estimate shear. Melchior et al. (2010) have introduced a model- 
independent deconvolution method. Two of these new approaches, Bernstein
(2010) and Gruen et al. (2010), have claimed significantly improved
GREAT08 quality factors of $Q_{\rm GREAT08} \simeq 1{,}000$ on the low-noise subsets,
and Melchior et al. (2010) have claimed $Q_{\rm GREAT08} \simeq 500$ on some real-noise
subsets.

APPENDIX D: OVERVIEW OF EXISTING PSF MODELING
METHODS

In this Appendix we will describe the techniques currently used by the 
gravitational lensing community to characterize the spatial variation of the 
PSF. We classify these approaches under three broad headings, as discussed 
in Section 2.5. 

Direct modeling. The modeling of the spatial variation of the PSF across 
an image can take the form of fitting simple, continuous surfaces to quantities 
that parametrize the PSF at a given location. In this case a surface is fit 
to the image and the best fit is found by determining an extreme value of 
some goodness of fit, usually a minimum chi-square, calculated discretely at 
the star positions. The quantity being fit can either be the pixel intensities 
themselves or some derived quantity such as the ellipticity and
Full-Width-Half-Maximum (FWHM) of the stars.



For example, in the KSB shear measurement method (see the GREAT08 
Handbook Appendix B) these are usually the two components of the stellar 
anisotropy correction, estimated using the measured ellipticities of stars. An- 
other method is to model PSFs using shapelet basis functions [e.g., Bernstein 
and Jarvis (2002); Refregier (2003); Massey and Refregier (2005); Kuijken 
(2006); Berge et al. (2008)], where at each star a shapelet model is fit and 
it is the spatial variation in each shapelet coefficient that describes the PSF 
surface. In lensfit [Miller et al. (2007); Kitching et al. (2008)] implementations
to date, the PSF is modeled on a pixel-by-pixel basis where for each
pixel in a postage stamp a 2D polynomial is fit across the image. PSFEx 
[e.g., Kalirai et al. (2003)] combines several model fitting aspects, allowing 
for various different orthonormal expansions in 2D to be used. Regardless of 
the PSF model parameter being modeled, there is considerable freedom in 
the choice of the functional form of the fitting surface [see, however, Rhodes 
et al. (2007) and Jarvis, Schechter and Jain (2008) for attempts to use PSF 
models motivated by realistic optical patterns]; however, simple bivariate 
polynomials are typically used. 
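
A minimal sketch of such a direct fit is given below, using NumPy to fit a bivariate polynomial to one PSF quantity (for example, one ellipticity component) measured at the star positions, and then evaluating the fitted surface elsewhere. The function names, the polynomial order and the unweighted least-squares fit (equivalent to a minimum chi-square fit with equal errors on every star) are assumptions made for this illustration.

```python
import numpy as np

def poly_design(x, y, order):
    """Design matrix whose columns are x^i y^j for all i + j <= order."""
    columns = [x ** i * y ** j
               for i in range(order + 1)
               for j in range(order + 1 - i)]
    return np.column_stack(columns)

def fit_psf_surface(x_star, y_star, value, order=2):
    """Least-squares polynomial surface through values measured at stars."""
    design = poly_design(x_star, y_star, order)
    coeffs, *_ = np.linalg.lstsq(design, value, rcond=None)
    return coeffs

def evaluate_psf_surface(coeffs, x, y, order=2):
    """Interpolate the fitted surface to arbitrary (nonstar) positions."""
    return poly_design(x, y, order) @ coeffs

# Toy example: 50 stars sampling a smooth, hypothetical ellipticity pattern.
rng = np.random.default_rng(1)
x_star = rng.uniform(0.0, 1.0, 50)
y_star = rng.uniform(0.0, 1.0, 50)

def true_pattern(x, y):
    return 0.01 + 0.02 * x - 0.03 * x * y

coeffs = fit_psf_surface(x_star, y_star, true_pattern(x_star, y_star), order=2)
x_gal = np.array([0.25, 0.5])
y_gal = np.array([0.75, 0.1])
e_at_galaxies = evaluate_psf_surface(coeffs, x_gal, y_gal, order=2)
```

The same design matrix can be reused for each PSF quantity being modeled, so that only the vector of measured values changes from fit to fit.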

There are well-known problems with polynomial fitting surfaces, including 
reduced stability at field edges and corners, as the fits become poorly con- 
strained. These have been noted but not necessarily tackled beyond sugges- 
tions of other, perhaps better behaved, functional schemes [e.g., Van Waer- 
beke, Mellier and Hoekstra (2005)]. Alternatively, it has been suggested 
[e.g., Kuijken (2006)] that the images can be Gaussianized to create a better
behaved local functional variation of the PSF, at the expense of correlated
noise properties; however, global PSF interpolation is still required for such
processes.

Indirect modeling: Principal component analysis (PCA). Indirect modeling
applies to a class of methods that do not model the PSF explicitly,
for example, using some kind of polynomial, but attempt to characterize 
the variation across the image by finding patterns that are present between 
many realizations of the same underlying PSF. Principal Component Anal- 
yses (PCA; also known as eigenfunction analyses) are an example of this 
kind of PSF modeling that has been implemented by the community. 

The motivation for such techniques is that gravitational lensing images 
(especially at particular positions, for example, away from the plane of the 
Milky Way) often contain only <1 star/arcmin$^2$ usable for PSF measurements.
If the PSF shows spatial variations on this or smaller scales, they
cannot be modeled reliably using a standard direct modeling interpolation, 
for example, with polynomial functions. However, these small-scale varia- 
tions may show some degree of stability between different images (exposures) 
obtained using the same instrument, depending on their physical origin. If 
this is the case, one can attempt to combine the information from different 
exposures (with stars at different pixel positions) to obtain a PSF model
with improved spatial interpolation.

A PCA interpolation was suggested for use in gravitational lensing by 
Jarvis and Jain (2004). They used PCA to identify the main directions 
of PSF variation between their exposures. This is typically done by first 
computing a standard polynomial interpolation in each exposure, where the 
polynomial order should be chosen to be sufficiently low to avoid overfitting. 
The PCA is then run on these polynomial coefficients. In this way the recur- 
ring PSF patterns in the different exposures can be described as a superposi- 
tion of principal component patterns, where each exposure is characterized 
by its principal component coefficients. These coefficients yield a sorting 
scheme which enables the desired combination of stars from different expo- 
sures. 

As the final step the stars from all exposures are fit together with a high- 
resolution model. In the description of Jarvis and Jain (2004) this model 
contains one higher-order spatial polynomial for each principal component. 
For the stars of a certain exposure these different polynomial terms are 
weighted according to the principal component coefficients of this exposure. 
Besides the polynomial order, one has to choose the number of included 
principal components. The first principal component is the most important 
one, carrying the largest variation in PSF space. The higher principal com- 
ponents carry less and less variation, and may at some point be truncated to 
avoid the fitting of noise, for example, once 99% of the variation is taken
into account.
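
The first stage of this procedure, a PCA of the per-exposure polynomial coefficients with truncation at a chosen fraction of the variation, can be sketched with NumPy as follows. The function name, the use of a singular value decomposition to obtain the principal components and the toy data are assumptions made for this illustration; the final joint high-resolution fit described above is not included.

```python
import numpy as np

def pca_truncate(coeff_matrix, variance_fraction=0.99):
    """PCA of polynomial coefficients, one row per exposure.
    Returns the mean coefficient vector, the retained principal components
    (as rows) and the per-exposure principal component coefficients."""
    mean = coeff_matrix.mean(axis=0)
    centred = coeff_matrix - mean
    _, singular_values, vt = np.linalg.svd(centred, full_matrices=False)
    variance = singular_values ** 2
    cumulative = np.cumsum(variance) / variance.sum()
    # Smallest number of components whose cumulative variance reaches the cut.
    n_keep = int(np.searchsorted(cumulative, variance_fraction) + 1)
    components = vt[:n_keep]
    amplitudes = centred @ components.T
    return mean, components, amplitudes

# Toy example: 50 exposures whose 6 polynomial coefficients vary along two
# hypothetical directions, plus a very small amount of noise.
rng = np.random.default_rng(2)
base = np.arange(6.0)
direction_1 = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
direction_2 = np.array([0.0, 1.0, 0.0, 0.0, 0.0, 0.0])
a1 = rng.normal(0.0, 1.0, 50)
a2 = rng.normal(0.0, 0.5, 50)
coeff_matrix = (base + np.outer(a1, direction_1) + np.outer(a2, direction_2)
                + 1e-6 * rng.normal(size=(50, 6)))
mean, components, amplitudes = pca_truncate(coeff_matrix)
reconstructed = mean + amplitudes @ components
```

The per-exposure amplitudes returned here play the role of the principal component coefficients that are used to sort and weight the stars from different exposures.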

Some of the principal components (eigenfunctions) can have a physical in- 
terpretation, for example, in relation to the telescope optics (see Section 2.5). 
The principal components of the PSF are expected to have some relation to 
the main physical effects influencing the PSF, such as changes in telescope 
focus, seeing, wind-shake or pointing elevation. However, in detail, the cross- 
identification may be difficult, in particular, for the higher, less important 
principal components. 

The PCA approach does not capture PSF variations which are completely 
random and appear in a single exposure only. On large scales these can 
be accounted for by combining the PCA model with a low-order residual 
polynomial fit, computed separately for each exposure [see Appendix B.5 of 
Schrabback et al. (2010)]. However, on small scales such random variations 
cannot be corrected. 

Also, given that PCA is a linear coordinate transformation, it does not 
efficiently describe possible nonlinear distributions of exposures in the PSF 
coefficient space. Such nonlinear distributions may, in particular, occur if 
PSF quantities with different responses to physical parameters are fitted 
together. For example, telescope defocus changes PSF ellipticity linearly, 
whereas PSF size is affected quadratically. If one aims at a compact PSF 
description with few principal components, this can partially be compensated
using additional terms in the final PCA fitting step, which depend
nonlinearly on the principal component coefficients [see Schrabback et al. 
(2010)]. 

Image deconvolution. Astronomical images can be blurred for various 
reasons: for example, atmospheric turbulence for ground-based telescopes, 
or thermal deformations of the telescope through varying exposure to sun- 
light for space-based telescopes. Methods for deconvolution and deblurring 
can often alleviate those effects [e.g., Kundur and Hatzinakos (1996) and ref- 
erences therein]. There are three aspects of the deconvolution process that 
we comment on here: 

• Nonblind vs. blind. In astronomy the point spread function (PSF), which 
describes the blur, can often be estimated from nearby stars, allowing non- 
blind deconvolution to deblur recorded images (called nonblind because 
the PSF is known). However, the PSF might not be available or cannot be 
reliably guessed. In that case blind methods try to recover simultaneously 
the PSF and the underlying image. This is usually achieved by optimiza- 
tion with alternating projections in combination with prior assumptions 
about the PSF and the image. 

• Single-frame vs. multi-frame. Especially important for nonastronomical 
photography is single-frame deconvolution, in which we only observe a sin- 
gle blurry image and would like to recover a deblurred image [e.g., Fergus 
et al. (2006) and Cho and Lee (2009)]. In astronomy we are often able 
to record multiple images of the same object. Combining a large num- 
ber of frames can then recover a single deblurred frame [e.g., by speckle- 
interferometric methods such as Labeyrie (1970); Knox and Thompson 
(1974), or by multi-frame deconvolution methods, e.g., Schulz (1993); 
Harmeling et al. (2009)]. 

• Space-invariant vs. space-variant. A common assumption is that the PSF
is space-invariant, that is, in different parts of the image, the blur of
a single star looks the same. For atmospheric turbulence this assumption 
holds only inside the isoplanatic patch, which is a small angular region 
the size of which depends on the amount of air turbulence. For images 
of larger angular regions (beyond the isoplanatic patch), the PSF does 
change across the image and this must be taken into account. Common 
approaches chop the image into overlapping patches with constant PSF 
and interpolate between them [e.g., Nagy and O'Leary (1998); Bardsley 
et al. (2006); Hirsch et al. (2010)]. 

Finally, we note that in astronomical images the presence of significant 
noise (typical objects used for cosmic shear have signal-to-noise ratios of 
~ 10-20) is a challenge to deconvolution algorithms. 

Other methods. While the direct and indirect modeling methods described
so far in this Appendix have been fully tested and applied to real
data, there are other techniques that have been proposed as potentially use- 
ful for modeling the spatial variation in the PSF. We now discuss two of 
these suggestions in brief. 

Kernel Principal Component Analysis [KPCA: e.g., Schölkopf, Smola and
Müller (1998); Shawe-Taylor and Cristianini (2004)], a highly successful tool
in pattern recognition, has been suggested as a further aid to stable PSF 
modeling. Variation in even basic aspects of a PSF (overall size, ellipticity, 
orientation) may often be difficult to succinctly describe in terms of lin- 
ear combinations of image pixel values. This means that PCA models may 
sometimes require unnecessary degrees of freedom, which impacts upon the 
stability of accurate shear measurement. KPCA allows the Principal Com- 
ponents to be separated in a higher-dimensional feature space, mapped im- 
plicitly via a dot-product kernel, where these nonlinear dependencies can be 
unraveled. 

Another statistical tool that might prove useful in spatial modeling of the 
PSF is known as kriging, commonly employed in the field of Geostatistics 
[see, e.g., Cressie (1991)]. This technique uses a weighted average of neigh- 
boring samples (in this case stars) to estimate the value of an unknown 
function at a given location. Weights are optimized using the location of the 
samples and all the relevant interrelationships between known and unknown 
values. The technique also quantifies confidence in model values. However, 
the accuracy of the method relies critically upon assumptions about the na- 
ture of the underlying variation; commonly-used types of kriging (such as 
Ordinary, Universal and IRFk-kriging) reflect the impact of differing choices 
for these assumptions. A further complication is the limited work done to- 
ward the application of kriging in the presence of noisy sampled data. 
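
As a rough illustration of the idea, the following NumPy sketch implements Ordinary kriging for a single PSF quantity sampled at star positions. The Gaussian covariance model, its length scale, the optional nugget term and the function name are all assumptions made for this example; as noted above, the accuracy of kriging depends critically on such choices.

```python
import numpy as np

def ordinary_kriging(xy_star, values, xy_target, length_scale=1.0, nugget=0.0):
    """Predict values at xy_target as a weighted average of star values,
    with weights constrained to sum to one (Ordinary kriging)."""
    def covariance(a, b):
        d2 = ((a[:, np.newaxis, :] - b[np.newaxis, :, :]) ** 2).sum(axis=2)
        return np.exp(-0.5 * d2 / length_scale ** 2)

    n = len(values)
    # Augmented system: covariance block plus the sum-to-one constraint.
    system = np.ones((n + 1, n + 1))
    system[:n, :n] = covariance(xy_star, xy_star) + nugget * np.eye(n)
    system[n, n] = 0.0
    k = covariance(xy_star, xy_target)
    rhs = np.vstack([k, np.ones((1, k.shape[1]))])
    solution = np.linalg.solve(system, rhs)
    weights = solution[:n]          # last row is the Lagrange multiplier
    return weights.T @ values, weights

# Toy example: five stars with hypothetical PSF sizes, two target positions.
stars = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]])
sizes = np.array([0.10, 0.20, 0.15, 0.30, 0.18])
targets = np.array([[0.5, 0.5], [0.25, 0.75]])
predicted, weights = ordinary_kriging(stars, sizes, targets, length_scale=0.5)
```

With no nugget term the prediction at a star position reproduces the value measured at that star; a nonzero nugget is one simple way of allowing for noisy samples.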

APPENDIX E: STAR CHALLENGE QUALITY FACTORS 

The estimation of the PSF in images has a broad range of applications 
in astronomy beyond gravitational lensing, hence, we will define a quality 
factor that is flexible enough to allow nonlensing applications to gain useful 
information. 

Each participant will submit a high-resolution estimate of the PSF at
each nonstar position that is requested. The quality factor used to judge
a method will be based on the residual (true PSF minus submitted estimate),
calculated for each pixel at each nonstar position. For astronomical analyses
we are concerned with the two basic parameters that have the highest impact
on the shape of the PSF: the ellipticity e of the PSF and the size R of the
PSF [Paulin-Henriksson et al. (2008)]. These are defined as follows [see
Bartelmann and Schneider (2001), Section 4.2].

First we define the second-order brightness moments of the image as

(22) $q_{ij} = \frac{\sum_p w_p I_p (\theta_i - \bar{\theta}_i)(\theta_j - \bar{\theta}_j)}{\sum_p w_p I_p}, \qquad i,j \in \{1,2\},$

where the sums are over pixels, $I_p$ is the flux in the $p$th pixel, $\theta$ is a
pixel position ($\theta_1 = x_p$ and $\theta_2 = y_p$) and $\bar{\theta}$ is the weighted
centroid. We include a weight function $w_p$, that will be defined when the
simulations are launched, to ensure that the sums over pixels converge for
the exact PSF model used. We now write the weighted ellipticity for a PSF
in complex notation as

(23) $e_{\rm PSF} = \frac{q_{11} - q_{22} + 2 i q_{12}}{q_{11} + q_{22} + 2 (q_{11} q_{22} - q_{12}^2)^{1/2}},$

where we have used a definition of ellipticity $|e| = (1 - r)(1 + r)^{-1}$ which
is consistent with Section 2; note there is an equivalent expression for
$|e| = (1 - r^2)(1 + r^2)^{-1}$ [see Bartelmann and Schneider (2001), Section 4.2].
For the weighted size we have a similar expression

(24) $R^2_{\rm PSF} = q_{11} + q_{22}.$
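
The moment definitions above can be sketched numerically as follows. This is an illustrative example only: the function names are hypothetical, the moments are taken about the weighted centroid, and a unit weight function is assumed because the actual weight is to be defined when the simulations are launched.

```python
import numpy as np


def weighted_moments(image, weight):
    """Second-order brightness moments q11, q22, q12 about the weighted centroid."""
    image = np.asarray(image, dtype=float)
    weight = np.asarray(weight, dtype=float)
    y, x = np.indices(image.shape)  # theta_2 = y_p, theta_1 = x_p
    wi = weight * image
    norm = wi.sum()
    dx = x - (wi * x).sum() / norm
    dy = y - (wi * y).sum() / norm
    q11 = (wi * dx * dx).sum() / norm
    q22 = (wi * dy * dy).sum() / norm
    q12 = (wi * dx * dy).sum() / norm
    return q11, q22, q12


def psf_ellipticity(q11, q22, q12):
    """Complex ellipticity with |e| = (1 - r)/(1 + r) for axis ratio r."""
    return (q11 - q22 + 2j * q12) / (q11 + q22 + 2.0 * np.sqrt(q11 * q22 - q12**2))


def psf_size_squared(q11, q22):
    """Squared size R^2 = q11 + q22."""
    return q11 + q22


# Toy PSF: an elliptical Gaussian with sigma_x = 3 and sigma_y = 2 pixels,
# measured with a unit weight function (an assumption for this example).
y, x = np.indices((65, 65))
psf = np.exp(-0.5 * (((x - 32) / 3.0) ** 2 + ((y - 32) / 2.0) ** 2))
q11, q22, q12 = weighted_moments(psf, np.ones_like(psf))
e_psf = psf_ellipticity(q11, q22, q12)
r2_psf = psf_size_squared(q11, q22)
```

For this toy PSF the axis ratio is r = 2/3, so the ellipticity should be close to (1 - r)/(1 + r) = 0.2 and the squared size close to 9 + 4 = 13.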

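We can calculate the variance between the ellipticity of the model and
true PSF, $\sigma^2(e_{\rm PSF}) = \langle (e_{\rm PSF} - e^t_{\rm PSF})^2 \rangle$, and similarly for
the size, $\sigma^2(R_{\rm PSF}) = \langle (R_{\rm PSF} - R^t_{\rm PSF})^2 \rangle$, where the superscript $t$
denotes the true PSF.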

For cosmic shear studies we require the residual error in the ellipticity
and size to be $\lesssim 10^{-3}$ for the impact on cosmological parameter
estimation to be low; for more detail see Paulin-Henriksson et al. (2008),
Paulin-Henriksson, Refregier and Amara (2009). Hence, we define the quality
factor for the Star Challenge as

(25) $P = \frac{1}{(1/2) \langle \sigma^2(R_{\rm PSF})/R^2_{\rm PSF} + \sigma^2(e_{\rm PSF}) \rangle},$

where the angle brackets represent an average over all images. There are fur- 
ther steps that could be taken that can map PSF residuals onto cosmic shear 
requirements [and indeed the Galaxy Challenge Q factor; Paulin-Henriksson 
et al. (2008)]; we will present these in the final GREAT10 analysis. 
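
A short numerical sketch of the quality factor may help fix ideas. It reads (25) as the reciprocal of half the image-averaged sum of the fractional size variance and the ellipticity variance. The function name, the array layout (images by nonstar positions), the use of the squared modulus for the complex ellipticity residual and the normalization by the mean true squared size are all assumptions of this example, not details confirmed by the challenge definition.

```python
import numpy as np


def star_quality_factor(e_est, e_true, r_est, r_true):
    """Quality factor P from arrays of shape (n_images, n_positions).

    Assumptions of this sketch: variances are taken over positions within
    each image, |.|^2 is used for complex ellipticities, and the size
    variance is normalized by the mean true R^2 of the image.
    """
    e_est = np.asarray(e_est, dtype=complex)
    e_true = np.asarray(e_true, dtype=complex)
    r_est = np.asarray(r_est, dtype=float)
    r_true = np.asarray(r_true, dtype=float)

    var_e = np.mean(np.abs(e_est - e_true) ** 2, axis=1)
    var_r = np.mean((r_est - r_true) ** 2, axis=1)
    r2 = np.mean(r_true**2, axis=1)

    # Angle brackets in (25): average over all images.
    return 1.0 / (0.5 * np.mean(var_r / r2 + var_e))


# Made-up example: residuals of 1e-3 in ellipticity and 1e-3 fractional in size.
e_true = np.full((3, 4), 0.05 + 0.02j)
e_est = e_true + 1e-3
r_true = np.full((3, 4), 2.0)
r_est = r_true + 2e-3
p_value = star_quality_factor(e_est, e_true, r_est, r_true)
```

With both residuals at the $10^{-3}$ level quoted above, each variance term is about $10^{-6}$, so under these assumptions P is of order $10^{6}$.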

In the following sections we outline more details on some additional PSF 
quality factors that we will employ in the final analysis. As well as the 
summed residual between the model and the data, we will investigate at 
least two other approaches that are more closely matched to the gravitational 
lensing requirements (but are less applicable to other fields of astronomy). 

These techniques will not be used to determine the outcome of the com- 
petition or used to create live quality factors during the challenge. 

Azimuthal statistics. We can quantify the quality of the submitted PSF
model in terms of how well a galaxy shear is recovered if this model were 
used. To do this, we will convolve the true PSF with a simple galaxy model,
and then use the submitted PSF to recover the galaxy parameters from the
true PSF-convolved image. This can be done for several angles in an az- 
imuthal bin (ring) (and be repeated for zero shear and nonzero shear) to 
obtain multiplicative and additive errors on the measured shear [m and c; 
see the Shear TEsting Programme; Heymans et al. (2006); Massey et al. 
(2007)]. These m, c values can be converted into a GREAT08 Q value (assuming
a constant local shear). By comparing this ring test with the summed
residuals, we will be able to correlate the magnitude of the residuals to the 
bias on the shear (i.e., Q). 
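
The ring test described above can be sketched with a toy example. It assumes the usual linear bias model, measured shear = (1 + m) x true shear + c, and a weak-shear approximation in which the observed ellipticity is the intrinsic ellipticity plus the shear. The function names, the input bias values and the way the bias is injected are hypothetical and stand in for a real PSF-correction step.

```python
import numpy as np


def ring_mean(e_measured):
    """Average measured complex ellipticities over the angles in one ring."""
    return np.mean(np.asarray(e_measured, dtype=complex))


def fit_shear_bias(g_true, g_meas):
    """Least-squares fit of g_meas - g_true = m * g_true + c; returns (m, c)."""
    g_true = np.asarray(g_true, dtype=float)
    g_meas = np.asarray(g_meas, dtype=float)
    m, c = np.polyfit(g_true, g_meas - g_true, 1)
    return m, c


# Intrinsic ellipticities at equally spaced angles: they cancel in the ring mean.
n_angles = 8
phi = np.pi * np.arange(n_angles) / n_angles
e_int = 0.3 * np.exp(2j * phi)

# Hypothetical biases that an imperfect PSF model might induce.
m_in, c_in = 0.02, 1e-3

# Zero and nonzero input shears (one component only, for simplicity).
g_true = np.array([0.0, 0.01, 0.03, 0.05])
g_meas = np.array(
    [ring_mean((1.0 + m_in) * (e_int + g) + c_in).real for g in g_true]
)
m_fit, c_fit = fit_shear_bias(g_true, g_meas)
```

In this toy setup the fit returns the injected m and c; in a real test the measured ellipticities would come from fitting galaxy images with the submitted PSF model.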

Autocorrelations in residuals. The autocorrelation of any continuous 
function h across its n-dimensional domain can be defined as 

(26) $[h \star h](\mathbf{r}) = \int h^*(\mathbf{x})\, h(\mathbf{x} + \mathbf{r})\, d^n x.$

We may also employ the function $h(\mathbf{x})$ to represent a discrete, noisy sampling
of an unknown "true" field $h_t(\mathbf{x})$ such that

(27) $h(\mathbf{x}) = h_t(\mathbf{x}) + N(\mathbf{x}),$

where N is a stochastic noise term. This description of discrete observations 
by a continuous "quasi-field" is a convenient notational shorthand in what 
follows; one can imagine the observations as a smooth field convolved with 
delta functions at the data locations. The data represented by h(x) can 
be pixel values, complex ellipticities or any general data vector. Similarly, 
a best-fitting model used to characterize these observations can be written as 

(28) $h_m(\mathbf{x}) = h_t(\mathbf{x}) + m(\mathbf{x}, h_t, N; f_m),$

where $m = h_m - h_t$ is referred to as the inaccuracy in the model [Rowe
(2010)] and $f_m$ labels the specific modeling scheme chosen to represent $h(\mathbf{x})$.

The unknown function m will depend nontrivially upon the data, noise 
and modeling scheme used, but not all of its properties are entirely hid- 
den. If we make two simple assumptions about the noise in the data being 
considered, that 

(29) $[N \star N](\mathbf{r}) = 0$

and

(30) $[h_t \star N + N \star h_t](\mathbf{r}) = 0,$

it follows that

(31) $[(h - h_m) \star (h - h_m)](\mathbf{r}) = [m \star m](\mathbf{r}) - [m \star N + N \star m](\mathbf{r}).$
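
The step to (31) can be spelled out as follows, taking the right-hand side of (29) to be zero. Subtracting (28) from (27) gives $h - h_m = N - m$, and expanding the autocorrelation term by term yields

```latex
\begin{aligned}
[(h - h_m) \star (h - h_m)](\mathbf{r})
  &= [(N - m) \star (N - m)](\mathbf{r}) \\
  &= [N \star N](\mathbf{r}) + [m \star m](\mathbf{r})
     - [m \star N + N \star m](\mathbf{r}) \\
  &= [m \star m](\mathbf{r}) - [m \star N + N \star m](\mathbf{r}),
\end{aligned}
```

where the last line uses (29) to drop the noise autocorrelation.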

As demonstrated in Rowe (2010), this residual autocorrelation function
should tend to zero for all $\mathbf{r}$ if the fitting model employed is both stable
and accurate: the residuals are then indistinguishable from pure white noise.
Departures from this ideal behavior have predictable consequences: overfitting
models will show consistently negative values of $(h - h_m) \star (h - h_m)$ (since
$[m \star N + N \star m] > m \star m > 0$ in these cases), whereas underfitting models
(for which $m \star m$ dominates) may be positive or negative in general.

If we consider the data in question to be an array of image pixels $I$
with a corresponding best-fitting model $I_m$, we can construct a practical
estimator $\xi_I$ for the autocorrelation function in (31). We do this only in the
isotropic case $\xi_I(\mathbf{r}) = \xi_I(r)$ for simplicity, but note that the
autocorrelation may not be isotropic in general.

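Following Schneider et al. (2002), we estimate the correlation function in
bins of finite width $\Delta\theta$ and define the function $\Delta_\theta(\phi) = 1$ for
$\theta - \Delta\theta/2 < \phi < \theta + \Delta\theta/2$ and zero otherwise: thus, $\Delta_\theta(\phi)$
defines the bin at separation $\theta$. A simple estimator $\xi_I$ for
$(I - I_m) \star (I - I_m)$ may then be written as

(32) $\xi_I(\theta) = \frac{1}{N_p(\theta)} \sum_{ij} w_i w_j [I(\boldsymbol{\theta}_i) - I_m(\boldsymbol{\theta}_i)][I(\boldsymbol{\theta}_j) - I_m(\boldsymbol{\theta}_j)]\, \Delta_\theta(|\boldsymbol{\theta}_i - \boldsymbol{\theta}_j|),$

where the sum is over all pixel pairs, the weight $w_i$ assigned to the $i$th pixel
may be used to account for variations in the signal-to-noise across the image
plane, and $N_p(\theta) = \sum_{ij} w_i w_j \Delta_\theta(|\boldsymbol{\theta}_i - \boldsymbol{\theta}_j|)$ is the effective number of
pairs in the bin considered.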
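
A brute-force version of this binned pair estimator is sketched below for small image stamps. The function name and the use of bin edges (rather than a bin center and width) are choices made for the example, and the all-pairs arrays make it unsuitable for large images.

```python
import numpy as np


def residual_autocorrelation(image, model, bin_edges, weights=None):
    """Isotropic, binned autocorrelation of the residual image - model.

    Sums w_i w_j res_i res_j over all pixel pairs whose separation falls
    strictly inside each bin, divided by the summed w_i w_j in that bin.
    Empty bins are returned as NaN.
    """
    image = np.asarray(image, dtype=float)
    model = np.asarray(model, dtype=float)
    resid = (image - model).ravel()
    w = np.ones_like(resid) if weights is None else np.asarray(weights, float).ravel()

    # Pixel positions and all pairwise separations (O(N^2) memory).
    yy, xx = np.indices(image.shape)
    pos = np.column_stack([xx.ravel(), yy.ravel()]).astype(float)
    sep = np.linalg.norm(pos[:, None, :] - pos[None, :, :], axis=-1)

    ww = w[:, None] * w[None, :]
    prod = ww * resid[:, None] * resid[None, :]

    xi = np.full(len(bin_edges) - 1, np.nan)
    for k in range(len(bin_edges) - 1):
        in_bin = (sep > bin_edges[k]) & (sep < bin_edges[k + 1])
        n_pairs = ww[in_bin].sum()  # effective number of pairs in the bin
        if n_pairs > 0:
            xi[k] = prod[in_bin].sum() / n_pairs
    return xi


# Two toy residual fields on a 12 x 12 stamp: a constant offset, which is
# perfectly correlated at all separations, and unit-variance white noise.
rng = np.random.default_rng(0)
model = np.zeros((12, 12))
edges = [0.5, 1.5, 2.5, 3.5]
xi_offset = residual_autocorrelation(model + 2.0, model, edges)
xi_noise = residual_autocorrelation(model + rng.normal(size=model.shape), model, edges)
```

A constant residual of 2 gives an estimate of 4 in every bin, whereas white-noise residuals give values that scatter around zero at nonzero separation, which is the behavior expected of a stable and accurate model.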

APPENDIX F: RULES OF THE CHALLENGE 

The challenge will be run as a competition, and the winner will be awarded
a prize.

We will award a prize to the participants with the highest quality factor 
in the Galaxy Challenge at the end of the submission period. Prizes will be 
awarded at a final GREAT10 meeting, and winners will be required to make 
a descriptive presentation of their method at this workshop. 

To define the scope of the competition we outline some participant rules 
here: 

(1) Participants may use a pseudonym or team name on the results leader 
board, however, real names (as used in publications) must be provided where 
requested during the result submission process. We will also require an email 
address to be provided, so that we can communicate GREAT10 information 
directly to participants. Participant details will be confidential, and no in- 
formation will be made available to any third party. 

(2) Participants who have investigated several algorithms may enter once 
per method. Changes in algorithm parameters do not constitute a different 
method. 

(3) Resubmissions for a given method may be sent at most once per
week per challenge during the 9-month competition; that is, one submission
per week is allowed for each of the Star Challenge and the Galaxy Challenge.
If submission rates fluctuate, the submission time interval may be altered to
accommodate the needs of the participants.

(4) Participants must provide a report detailing the method used, at the
Challenge deadline. We will also provide a webpage where we will encourage 
participants to keep a log of their activities. If participants would like to 
provide code, this can also be uploaded to the webpage. 

(5) Any publication that contains results relating to the GREAT10 sim- 
ulations, that authors wish to submit during the 9 months of the challenge, 
must be approved by the GREAT10 PI (T. D. Kitching; tdk@roe.ac.uk) 
and GREAT10 Advisory Team before submission to any journal or online 
archive. 

(6) We expect all participants to allow their results to be included in 
the final Challenge Report. We will, however, be flexible in cases where 
methods performed badly if participants are against publicizing them. We 
will release the true shears and PSFs (and variation in power spectrum) 
after the deadline. 

(7) Participants are encouraged to freely write research articles using the 
Challenge simulations, after the submission of the GREAT10 results article. 
We especially encourage participants to submit articles on their methods to
the host journal for GREAT10, Annals of Applied Statistics.

(8) The simulations may be updated during the challenge and/or mod- 
ified, if any improvements are required. If any modification occurs, partic- 
ipants will be notified by the email addresses provided at submission, and 
any changes will be posted on the GREAT10 website. 

GREAT10 team rules. The GREAT10 Team is defined as the PI (T. D.
Kitching), a GREAT10 Coordination Team whose role is to make decisions
related to the input properties of the simulations, and a GREAT10 Advi- 
sory Team whose role is to advise on all other matters that do not directly 
influence the simulations (e.g., workshop activities). 

Some additional competition rules apply to members of the GREAT10 
Coordination Team and PI: 

(1) For the purpose of these rules, the GREAT10 Coordination Team is 
defined as being anyone who has participated in a GREAT10 Coordination 
Team meeting (in person, video or phone conference) or who has access to 
the GREAT10 Coordination website. 

(2) Only information available to non-GREAT10 participants may be
used in carrying out the analysis, for example, no inside information about 
the setup of the simulations may be used. Note that the true shear values will 
only be available to an even smaller subset of the GREAT10 Coordination 
Team and PI. 

(3) Any submission from the GREAT10 Coordination Team, or PI, will 
be highlighted in the GREAT10 results article, in a similar way to GREAT08 
Team members in Bridle et al. (2009).

The above rules do not apply to the members of the GREAT10 Team who
are in the GREAT10 Advisory Team only.